Method and system for sequence presentation

ABSTRACT

Based on a partial sequence of a target gene having an unidentified sequence, a partial sequence corresponding thereto is extracted from a genome sequence by homology search on a database. Exon regions are predicted using plural programs, respectively, and common sequences among the predicted exon regions are extracted. A set of primers is designed based on a combination of a 5′ end sequence and a 3′ end sequence selected from the plurality of common sequences. Amplification using the selected combination of 5′ end and 3′ end sequences as a set of primers results in an amplified gene that may be cloned. Also provided is a display system for preparing primer sequences and utilizing the prepared primer sequences.

BACKGROUND OF THE INVENTION

[0001] 1. Field of the Invention

[0002] The present invention relates to a method for predicting exon sequences of a gene based on enormous amounts of genome information of eukaryotic organisms. More specifically, it relates to a display system for preparing primer sequences and utilizing the prepared primer sequences.

[0003] 2. Description of the Related Art

[0004] Information on the genome base sequences of human and other organisms has been accumulated. In addition, sequences of human gene regions that are translated into proteins performing vital functions, i.e., sequences of cDNAs, are now being clarified by a genome project in Japan that utilizes technologies for acquiring cDNAs, and information thereon is being accumulated.

[0005] To analyze the function of a gene, it is essential to analyze the function of the protein translated from the gene, as well as to read its sequence information.

[0006] Accordingly, it is important to clarify and investigate a novel gene function using, as a starting material, a cDNA clone obtained based on the sequence information in the form of character strings.

[0007] Polymerase chain reactions (PCRs) are widely used to acquire cDNA clones of a gene whose sequence information has been completely clarified. A cDNA of a coding region of the target gene can be obtained relatively easily by performing PCR using a cDNA prepared from an mRNA and two primers having outside sequences sandwiching the identified coding sequence. However, if the sequence information of the target gene has been clarified only partially, the sequences sandwiching the target coding sequence are not identified, and a cDNA of the coding region cannot be obtained by conventional PCR.

[0008] As a possible solution to identify a sequence of an unidentified region based on a partial sequence, rapid amplification of cDNA ends (RACE) has been proposed. In RACE, an oligo DNA having an identified sequence is artificially added to the 5′ end of a cDNA upon or after the preparation of the cDNA from an mRNA, the unidentified region is then amplified by PCR using primers corresponding to an identified partial sequence sandwiching the target unidentified sequence and to a sequence of the artificially added oligo DNA, and a cDNA of the target gene is obtained (FIG. 9).

[0009] According to RACE, however, the total length of a cDNA of the target gene cannot be predicted, since the sequence of the target gene is not identified. Accordingly, it cannot be determined whether or not a DNA fragment amplified by PCR corresponds to the target region. The unpredictability of the amplification length also makes it difficult to set PCR conditions, because the temperature and time of PCR directly affect the amplification length and amplification efficiency.

[0010] In addition, if the target gene has a very long sequence, all primers and enzymes used in the reaction must be optimized, which further complicates acquisition of the cDNA of the coding region of the target gene having a long sequence.

[0011] RACE requires a reaction process step 45 for adding an oligo DNA artificially to the 5′ end of cDNA as an essential step (FIG. 9). The reaction efficiency in this reaction process step affects the amplification efficiency, since molecules of the resulting cDNA of the target gene including the added oligo DNA serve as a template in amplification. In particular, if the target gene is one in which only a very small amount of an mRNA is expressed, the target gene cannot significantly be obtained by RACE, since the number of molecules of the oligo DNA before the addition reaction is small. As described above, a gene having an unidentified sequence has been obtained by RACE in many cases, but it is much more difficult to obtain by this technique than a gene having a completely identified coding sequence.

[0012] Sequence information on regions other than the identified partial sequence cannot be obtained according to RACE unless a PCR can be performed between the oligo DNA added to the cDNA and the identified partial sequence in the reaction process step, in which the oligo DNA is artificially added to the 5′ end of the cDNA.

SUMMARY OF THE INVENTION

[0013] The present invention provides a method for presentation of sequences to prepare a common sequence. In this method, initially, a predetermined partial sequence is extracted from a mRNA. A partial sequence of a genome sequence corresponding to the partial sequence of the mRNA is then extracted by homology search on a database. Exon regions within the partial sequence of the genome sequence are predicted using computer programs. In this process, respective exon regions are predicted through the use of respective computer programs. A sequence common to the exon regions predicted through the use of the respective computer programs is prepared as a common sequence.

[0014] Such databases include, for example, GenBank. Such programs include, for example, GenScan and FGENESH.

[0015] The common sequence thus prepared is a common exon sequence extracted through the use of plural programs and has high reliability. The 5′ end and the 3′ end of the common sequence can be used as primers for amplification by PCR. According to the conventional techniques, a primer is designed only directed to the oligo DNA artificially added to the end of the cDNA. In contrast, plural sets of primers can be obtained according to the present invention when plural common sequences are obtained, which increases the probability of amplification by PCR.

[0016] To increase the reliability, the number of the plural programs is preferably large particularly in the case where calculation can be performed on plural types of programs simultaneously. However, if calculation cannot be performed on plural types of programs simultaneously, the upper limit of the number of the programs is preferably set in view of saving time.

[0017] After obtaining the plural common sequences, one or more primers in a sense direction and in an antisense direction, respectively, in each common sequence are designed. A sense primer designed in a common sequence and an antisense primer designed in the same or another common sequence are linked to yield one set of primers (hereinafter referred to as “predicted set of primers”). In this manner, plural predicted sets of primers between common sequences are prepared. The use of plural sets of primers increases probability of amplification in the subsequent PCR process step as compared with the use of a single set of primers.

[0018] The combinations of the sets of primers are arbitrarily selected based on PCR amplification lengths.

[0019] In the above method, common sequences alone are extracted. It is also acceptable that minority sequences are extracted in addition to the common sequences. The term “minority sequence” as used herein means a sequence of an exon region, which exon can be predicted through the use of one program but cannot be predicted through the use of another program. Such minority sequences are preferably extracted when the extracted common sequences cannot yield sufficient information, i.e., when only a few types of common sequences can be extracted and a sufficient number of sets of primers cannot be obtained from combinations of the 5′ ends and the 3′ ends of the common sequences.
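The distinction between common and minority sequences can be sketched as follows. This is only an illustration under stated assumptions: each program's predicted exons are represented as (start, end) coordinates on the genome sequence, a representation this description does not specify, and the function name and sample coordinates are hypothetical.

```python
def minority_exons(exons_a, exons_b):
    """Return the exons in exons_a that overlap no exon in exons_b.

    Exons are (start, end) tuples with inclusive coordinates
    (an assumed representation, not taken from the description).
    """
    def overlaps(x, y):
        return x[0] <= y[1] and y[0] <= x[1]

    return [a for a in exons_a if not any(overlaps(a, b) for b in exons_b)]


# Hypothetical exon coordinates from two prediction programs
program_a = [(100, 250), (400, 520), (900, 1000)]
program_b = [(120, 260), (905, 990)]

only_a = minority_exons(program_a, program_b)
print(only_a)  # exons predicted by program A alone
```

In this sketch an exon counts as a minority sequence only when it has no overlap at all with the other program's predictions; a stricter or looser criterion could equally be chosen.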

[0020] Upon selection of a set of primers, the amplification length thereof is preferably set at about 500 to about 1000 bp. An excessively long amplification length deteriorates the amplification efficiency, whereas an excessively short amplification length makes the amplified products difficult to identify. The amplification length of the set of primers is not specifically limited to the range specified above and is arbitrarily selected depending on amplification conditions such as temperature, time, and type of enzyme used.

[0021] After designing the primers, an amplification reaction by PCR is performed using the designed primers with a template cDNA obtained from the mRNA by reverse transcription. The reaction mixture obtained as a result of amplification is subjected to electrophoresis to detect the presence or absence of amplified products. Plural amplified products are purified and are then subjected to sequencing. One sequence is defined as an overlapped region from among sequences of the plural amplification products. In the sequence, a coding sequence is determined, and a primer for amplification of the coding region is designed. The coding region is amplified using a cDNA as a template. The resulting amplified product and a cloning vector or an expression vector are ligated. The reaction mixture is introduced into Escherichia coli, and cDNA clone is obtained on a selection medium.

[0022] The present invention also provides a display system having the following configuration.

[0023] Initially, a partial sequence is extracted from a mRNA. A partial sequence of a genome sequence corresponding to the partial sequence of the mRNA is then determined by homology search on a genome database.

[0024] On a display screen, selecting means for selecting plural programs for predicting exon regions at the 5′ end, the 3′ end or both of the partial sequence is displayed. The selecting means determines programs that predict exons. The programs comprise two or more types of programs, such as GenScan, FGENESH, and Grail.

[0025] Respective exon regions predicted through the use of the programs are displayed on the screen on a program basis. The display system further includes a common sequence extraction button that extracts a region common to the respective exon regions predicted through the use of the respective programs. By this configuration, the procedure of extracting a common sequence can be visualized on the display system and can easily be performed.

[0026] When plural common sequences are extracted, the screen may display a selecting means for extracting a combination of the 3′ end and the 5′ end of any of the plural common sequences. The selecting means can determine a set of primers and mainly serves to select the length of a sequence to be amplified. The screen may display a box to select the length of the sequence. Alternatively, an operator arbitrarily selects the set of primers by double-clicking on a displayed primer region and the set of primers is selected in a region containing the target common sequence. On the display screen, primers not used may be omitted by filtering.

[0027] The display system preferably further includes a sequence display means for displaying the sequence of the selected set of primers. Subsequent to the above procedures, one prepares the target primers and then performs PCR and other operations. By displaying the sequence of the set of primers, one can prepare the primers with reference to the sequence with very good workability.

[0028] The display system preferably further includes a sequence display means for displaying the sequence of a region to be sandwiched between the selected set of primers. The sequence of the set of primers can be output as data to be compared with actually determined data.

[0029] In addition, the display system may preferably include a minority sequence extraction button for extracting an exon region that is predicted through the use of a predetermined program but is not predicted through the use of another program. When the use of a selected set of primers obtained by extracting the common sequences alone does not contribute to amplification, the minority sequence should preferably be extracted.

[0030] By extracting common sequences, reliability on exon sequence information is improved, and a large number of candidates for sets of primers can be obtained. Accordingly, a cDNA clone of a total coding region can be obtained from a partial sequence of a gene having an unidentified sequence by PCR with an almost equivalent efficiency to the acquisition of a cDNA clone of a gene having an identified sequence.

[0031] Other and further objects, features and advantages of the invention will appear more fully from the following description.

BRIEF DESCRIPTION OF THE DRAWINGS

[0032] This invention is to be described specifically for preferred embodiments with reference to the drawings. Throughout the drawings for explaining the preferred embodiments, those having identical functions carry the same reference numerals, for which duplicate explanations have been omitted, wherein:

[0033] FIG. 1 is a schematic diagram showing a cloning process using a database according to the present invention;

[0034] FIG. 2 shows process steps for designing a predicted set of primers from partial sequence information;

[0035] FIG. 3 is a flow chart of process steps for designing the predicted primers;

[0036] FIG. 4 shows process steps of selecting common sequences and of selecting sets of primers designed in the common sequences;

[0037] FIG. 5 illustrates process steps of performing a PCR using the designed sets of primers, of determining a coding region of a target gene, and of cloning based on information thereof;

[0038] FIG. 6 illustrates a display screen image of Basic Local Alignment Search Tool (BLAST) search for a partial sequence;

[0039] FIG. 7 schematically illustrates information of position of the predicted exon sequence as a box object in a window showing the prediction;

[0040] FIG. 8 shows process steps of identifying a genome region of a partial sequence, of predicting exon sequences using prediction programs, and of identifying common sequences among the exon sequences predicted through the use of plural prediction programs;

[0041] FIG. 9 shows process steps of RACE as a conventional technique; and

[0042] FIG. 10 shows the result of BLAST search of a partial sequence in an example of the present invention.

DETAILED DESCRIPTION OF THE INVENTION

[0043] It is to be understood that the figures and descriptions of the present invention have been simplified to illustrate elements that are relevant for a clear understanding of the present invention, while eliminating, for purposes of clarity, other elements that may be well known. Those of ordinary skill in the art will recognize that other elements are desirable and/or required in order to implement the present invention. However, because such elements are well known in the art, and because they do not facilitate a better understanding of the present invention, a discussion of such elements is not provided herein. The detailed description of the preferred embodiments of the present invention will be provided herein below with reference to the attached drawings.

[0044] FIG. 1 schematically illustrates cloning process steps using a database according to the present invention. An organism synthesizes (transcribes) an mRNA from a genome sequence 50. The mRNA will serve as a template in protein synthesis (translation). The genome sequence 50 includes “exon” regions carrying genetic information necessary for protein synthesis and “intron” regions carrying no such information. The two types of regions are arrayed alternately. Initially, an mRNA 51 including both regions is transcribed. The mRNA 51 then undergoes processing to remove introns to thereby form an mRNA 52 carrying continuous sequence information comprising the exons alone. The mRNA 52 containing no introns serves as a template in protein synthesis.

[0045] A partial sequence 54 of a gene having unidentified sequence is experimentally obtained from part of a mRNA 53 containing no introns. The mRNA 53 having the partial sequence 54 has been synthesized (transcribed) from the genome 50, and it should be understood that the genome 50 carries a region having the partial sequence 54. By searching an accumulated database of genome sequences for a region corresponding to the partial sequence 54, a genome region at which the transcribed gene is present can be identified. However, the identified genome region carries a sequence containing exons and introns arrayed alternately.

[0046] Accordingly, the present invention provides a method for cloning a gene having an unidentified sequence. In this method, a sequence comprising exons alone is extracted from the genome sequence using programs for predicting exons and introns to thereby predict sequence information of a mRNA containing no introns. Primers are then designed based on the sequence information and are subjected to amplification to thereby clone the gene having an unidentified sequence.

[0047] FIG. 8 shows process steps according to the present invention of identifying a genome region of a partial sequence, of predicting exon sequences using prediction programs, and of specifying common sequences among the exon sequences predicted through the use of plural prediction programs. A genomic database is searched for partial sequence information 61 of the gene having an unidentified sequence, and a genome sequence 62 containing the partial sequence is extracted. Exon sequences are predicted (hereinafter referred to as “predicted exon sequences”) based on the genome sequence 62 using plural exon prediction programs. The predicted exon sequences determined through the use of the respective programs are compared with one another to thereby extract common sequences among the predicted exon sequences (hereinafter referred to as “common sequences”).

[0048] FIG. 2 shows process steps according to the present invention for the design of a predicted set of primers based on the partial sequence information.

[0049] FIG. 2 illustrates an input device 1, an output device 2, and processing units (CPUs and memories) 10 and 20 for primer design. The processing unit 10 comprises a partial sequence input unit 11 for entering the partial sequence of the target gene, a homology search unit 12 for searching the genomic database, an exon prediction unit 13 for predicting exons in the predicted genome sequence, a comparison and extraction unit 14 for comparing the predicted exon sequences and extracting the common sequences, a designing unit 15 for designing primers in the common sequences, a computing unit 16 for calculating relative relations between the common sequences and distances between individual primers, and a primer set extraction unit for extracting sets of primers for the amplification with optimum predicted lengths. The processing unit 20 comprises plural programs 21 and databases 22 on sequences. The programs 21 have algorithms for prediction of exons and are used in the processing in the exon prediction unit 13 in the processing unit 10. The databases 22 are used for the prediction of the exons.

[0050] The process steps of identifying the position of the target gene on the genome based on the partial sequence information, of extracting the predicted exon sequences from the genome sequence, and of designing primers are illustrated in detail below.

[0051] FIG. 3 is a flow chart of the design process steps for the predicted primers, and FIG. 4 shows process steps of selecting common sequences and of selecting sets of primers designed in the common sequences.

[0052] With reference to FIGS. 3 and 4, a genomic database is homology-searched for a partial sequence X, a predicted genome sequence Y containing the partial sequence X is extracted, and the sequence datum thereof is written into another file (Step S11 in FIG. 3 and Step 31 in FIG. 4). The predicted genome sequence Y is a continuous genome sequence straddling the partial sequence X and is extracted from the sequences on the database.
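The extraction of a predicted genome sequence Y straddling the partial sequence X can be sketched as follows. This is a simplified illustration only: exact substring matching stands in for the homology search (a real search such as BLAST tolerates mismatches and gaps), and the function name, the `flank` parameter, and the toy sequences are hypothetical.

```python
def extract_region(genome, partial, flank):
    """Locate `partial` in `genome` and return (start, end, region).

    Exact matching is a stand-in for a real homology search.
    start and end are 0-based, end-exclusive coordinates of the
    returned region; None is returned when there is no match.
    """
    pos = genome.find(partial)
    if pos == -1:
        return None
    start = max(0, pos - flank)
    end = min(len(genome), pos + len(partial) + flank)
    return start, end, genome[start:end]


# Toy data (hypothetical): X sits in the middle of a short "genome"
genome = "TTTTGGGGACGTACGTCCCCAAAA"
hit = extract_region(genome, "ACGTACGT", flank=4)
print(hit)
```

How far the extracted region should extend on either side of X is not stated here, so `flank` is left as a free parameter in this sketch.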

[0053] The predicted genome sequence Y is subjected to two exon prediction programs and thereby the predicted exon sequences are extracted from the predicted genome sequence Y and are written into other files (Step S12 in FIG. 3 and Step 32 in FIG. 4). The two exon prediction programs used herein are GenScan [Burge, C. and Karlin, S. (1997) “Prediction of complete gene structures in human genomic DNA” J. Mol. Biol. 268: 78-94] and FGENESH [Salamov A. A., Solovyev V. V., (1999), unpublished data, refer to Kulp, D., Haussler, D., Reese, M. G., and Eeckman, F. H. (1996), Proc. Conf. on Intelligent Systems in Molecular Biology, 134-142]. By subjecting the predicted genome sequence Y to plural exon prediction programs, extraction of false positive sequences and omission of false negative sequences can be prevented. These problems occur when the sequences are extracted by using only one prediction algorithm of one program.

[0054] Next, the respective predicted exon sequences written into the files extracted through the use of the plural programs are compared with one another, and common sequences Z are written into another file (Steps S13 to S15 in FIG. 3). In this procedure, the respective common sequences Z retain the order in which they occur in the genome sequence Y (Step 33 in FIG. 4).
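The comparison step can be sketched as an interval intersection. This is a minimal illustration, assuming each program reports its predicted exons as (start, end) coordinates on the genome sequence Y; the description itself works with sequence files, and the function name and coordinates below are hypothetical.

```python
def common_regions(exons_a, exons_b):
    """Intersect two lists of (start, end) inclusive exon intervals.

    Returns the overlapping parts sorted by genome position, so the
    result keeps the order in which the regions occur on the genome.
    """
    common = []
    for a_start, a_end in exons_a:
        for b_start, b_end in exons_b:
            start = max(a_start, b_start)
            end = min(a_end, b_end)
            if start <= end:
                common.append((start, end))
    return sorted(common)


# Hypothetical exon coordinates from two prediction programs
program_a = [(100, 250), (400, 520), (900, 1000)]
program_b = [(905, 990), (120, 260)]

common = common_regions(program_a, program_b)
print(common)
```

Sorting the result by start coordinate is one simple way to preserve the genome order of the common sequences regardless of the order in which each program lists its exons.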

[0055] Subsequently, primers are designed in a sense direction and in an antisense direction in all the common sequences Z, and the sequences of the designed primers are written into a file (Steps S16 and S17 in FIG. 3 and Step 34 in FIG. 4).

[0056] Approximate predicted sizes of PCR amplification products of all the combinations of any designed primer (in a sense direction) of any common sequence and another designed primer (in an antisense direction) of any other common sequence are calculated based on the positional relations among the common sequences Z, and sets of primers to have predetermined amplified sizes and their predicted amplified sizes are listed (Step S18 in FIG. 3 and Step 36 in FIG. 4).

[0057] In addition, an approximate predicted size of a PCR amplification product using a sense primer and an antisense primer in any one common sequence is calculated, this procedure is repeated, and the resulting sets of primers and their predicted amplified sizes are listed (Step S19 in FIG. 3 and Step 35 in FIG. 4).
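The listing of primer sets by predicted amplified size can be sketched as follows. This is an illustration under assumptions that the description does not spell out: each sense primer is represented by its start position and each antisense primer by its end position on the concatenated common sequences, and the predicted size is taken as the distance between them. All names and coordinates are hypothetical.

```python
from itertools import product


def select_primer_sets(sense, antisense, min_len, max_len):
    """List (sense, antisense, predicted_size) within a size range.

    sense: {name: start position}, antisense: {name: end position};
    positions are 1-based on the concatenated common sequences
    (an assumed coordinate system).
    """
    selected = []
    for (f_name, f_start), (r_name, r_end) in product(
        sense.items(), antisense.items()
    ):
        size = r_end - f_start + 1
        if min_len <= size <= max_len:
            selected.append((f_name, r_name, size))
    return selected


# Hypothetical primer positions on three common sequences Z1 to Z3
sense = {"Z1-F": 1, "Z2-F": 301, "Z3-F": 801}
antisense = {"Z1-R": 280, "Z2-R": 760, "Z3-R": 1400}

sets = select_primer_sets(sense, antisense, 500, 1000)
print(sets)
```

The same loop covers both cases described above: pairs spanning two different common sequences and pairs lying within a single common sequence.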

[0058] cDNA of a coding region of the target gene is then cloned using primers belonging to the sets of primers thus designed and output. The process steps for this cloning procedure are illustrated in detail below.

[0059] FIG. 5 illustrates process steps of performing a PCR using the designed sets of primers, of specifying a coding region of a target gene, and of cloning based on information thereof.

[0060] Initially, PCRs are performed using the sets of primers having the designed sequences, and the presence or absence of PCR amplification products is detected by electrophoresis (Step 37 in FIG. 5).

[0061] The respective amplification products are purified and are then subjected to cycle sequencing to thereby yield sequencing samples, and the sequencing samples are subjected to a sequencer to identify the sequences.

[0062] The resulting sequence data are linked at the common sequence regions to thereby yield one serial sequence (Step 38 in FIG. 5). In addition, a coding sequence in the serial sequence is determined (Step 40 in FIG. 5).
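Linking two sequencing results at a shared region can be sketched as follows. This is a simplified illustration: it joins two reads only when the end of one exactly matches the start of the other, whereas real sequence assembly tolerates sequencing errors. The function name, the `min_overlap` parameter, and the reads are hypothetical.

```python
def merge_by_overlap(left, right, min_overlap=10):
    """Join two reads where a suffix of `left` equals a prefix of `right`.

    The longest exact overlap of at least `min_overlap` bases is used;
    None is returned when no such overlap exists.
    """
    max_k = min(len(left), len(right))
    for k in range(max_k, min_overlap - 1, -1):
        if left.endswith(right[:k]):
            return left + right[k:]
    return None


# Hypothetical reads sharing a 12-base region
read1 = "ATGGCCATTGTAATGGGCCGC"
read2 = "GTAATGGGCCGCTGAAAGGGT"

merged = merge_by_overlap(read1, read2, min_overlap=8)
print(merged)
```

Repeating this merge across all sequenced amplification products, in genome order, would yield the single serial sequence referred to above.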

[0063] Primers corresponding to sequences outside the coding region are designed, and the designed primers are subjected to a PCR using a cDNA as a template to thereby amplify the coding region (Steps 40 and 41 in FIG. 5).
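How the coding region is determined is not detailed in this description. One common approach, shown here purely as an assumption, is to take the longest open reading frame (ATG through the first in-frame stop codon) on the forward strand of the serial sequence. The function name and the toy sequence are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def longest_orf(seq):
    """Return (start, end) of the longest ATG-to-stop ORF, or None.

    Coordinates are 0-based and end-exclusive, and the stop codon is
    included. Only the forward strand is scanned in this sketch.
    """
    best = None
    for i in range(len(seq) - 2):
        if seq[i:i + 3] != "ATG":
            continue
        for j in range(i + 3, len(seq) - 2, 3):
            if seq[j:j + 3] in STOP_CODONS:
                if best is None or (j + 3 - i) > (best[1] - best[0]):
                    best = (i, j + 3)
                break
    return best


# Toy serial sequence (hypothetical)
serial = "CCATGAAATTTGGGTAACC"
orf = longest_orf(serial)
print(orf, serial[orf[0]:orf[1]])
```

Primers for amplifying the coding region would then be placed in the sequence lying before `start` and after `end`.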

[0064] The resulting amplified product is ligated with a cloning vector or an expression vector and is introduced into Escherichia coli (Step 42 in FIG. 5). The treated Escherichia coli is then cultured on an antibiotic-added agar medium. By action of an antibiotic resistance marker coded on the vector, Escherichia coli carrying the vector can selectively grow on the antibiotic-added agar medium, and thus a clone carrying the coding region of the target gene on the vector is obtained (Step 43 in FIG. 5).

[0065] The present invention will be illustrated in further detail with reference to several examples below, which are not intended to limit the scope of the invention.

EXAMPLE 1 Method for Presenting Predicted Sequences

[0066] 1. Determination of Genome Sequence and Extraction of Predicted Sequences from Partial Sequence of Unidentified Human Gene.

[0067] A partial sequence (SEQ ID NO: 1) experimentally obtained from a human mRNA was subjected to BLAST search [Altschul, S. F., Gish, W., Miller, W., Myers, E. W., and Lipman, D. J. (1990) “Basic local alignment search tool” J. Mol. Biol. 215:403-410] on GenBank (http://www.ncbi.nlm.nih.gov/Genbank/index.html). As a result, the partial sequence coincided with part of a genome sequence (FIG. 10). The partial sequence in question is a sequence corresponding to 45076-45230 of AL365356 and is represented by SEQ ID NO: 2.

[0068] The obtained genome sequence (SEQ ID NO: 2) was then subjected to two exon prediction programs to thereby extract exon sequences predicted from the genome sequence. The two exon prediction programs used herein are GenScan [Burge, C. and Karlin, S. (1997) “Prediction of complete gene structures in human genomic DNA” J. Mol. Biol. 268: 78-94] and FGENESH [Salamov A. A., Solovyev V. V., (1999), unpublished data, refer to Kulp, D., Haussler, D., Reese, M. G., and Eeckman, F. H. (1996), Proc. Conf. on Intelligent Systems in Molecular Biology, 134-142].

[0069] The predicted exon sequences are as follows.

[0070] [Exon Sequences Predicted by GENSCAN]

[0071] Exon A1 (SEQ ID NO: 3)

[0072] Exon A2 (SEQ ID NO: 4)

[0073] Exon A3 (SEQ ID NO: 5)

[0074] Exon A4 (SEQ ID NO: 6)

[0075] Exon A5 (SEQ ID NO: 7)

[0076] Exon A6 (SEQ ID NO: 8)

[0077] Exon A7 (SEQ ID NO: 9)

[0078] Exon A8 (SEQ ID NO: 10)

[0079] Exon A9 (SEQ ID NO: 11)

[0080] Exon A10 (SEQ ID NO: 12)

[0081] Exon A11 (SEQ ID NO: 13)

[0082] Exon A12 (SEQ ID NO: 14)

[0083] Exon A13 (SEQ ID NO: 15)

[0084] Exon A14 (SEQ ID NO: 16)

[0085] Exon A15 (SEQ ID NO: 17)

[0086] Exon A16 (SEQ ID NO: 18)

[0087] [Exon Sequences Predicted by FGENESH]

[0088] Exon B1 (SEQ ID NO: 19)

[0089] Exon B2 (SEQ ID NO: 20)

[0090] Exon B3 (SEQ ID NO: 21)

[0091] Exon B4 (SEQ ID NO: 22)

[0092] Exon B5 (SEQ ID NO: 23)

[0093] Exon B6 (SEQ ID NO: 24)

[0094] Exon B7 (SEQ ID NO: 25)

[0095] Exon B8 (SEQ ID NO: 26)

[0096] Exon B9 (SEQ ID NO: 27)

[0097] Exon B10 (SEQ ID NO: 28)

[0098] 2. Extraction of Common Sequences of Predicted Exon Sequences

[0099] The exon sequences extracted from the genome sequence using the individual prediction programs were compared with one another. As a result, the combinations of Exon A1 with Exon B1, Exon A6 with Exon B3, Exon A8 with Exon B4, Exon A9 with Exon B5, Exon A11 with Exon B6, Exon A12 with Exon B7, Exon A13 with Exon B8, Exon A15 with Exon B9, and Exon A16 with Exon B10 were each found to have a common sequence. These common sequences were then extracted.

[0100] Common sequence C1 between Exon A1 and Exon B1 (SEQ ID NO: 29)

[0101] Common sequence C2 between Exon A6 and Exon B3 (SEQ ID NO: 30)

[0102] Common sequence C3 between Exon A8 and Exon B4 (SEQ ID NO: 31)

[0103] Common sequence C4 between Exon A9 and Exon B5 (SEQ ID NO: 32)

[0104] Common sequence C5 between Exon A11 and Exon B6 (SEQ ID NO: 33)

[0105] Common sequence C6 between Exon A12 and Exon B7 (SEQ ID NO: 34)

[0106] Common sequence C7 between Exon A13 and Exon B8 (SEQ ID NO: 35)

[0107] Common sequence C8 between Exon A15 and Exon B9 (SEQ ID NO: 36)

[0108] Common sequence C9 between Exon A16 and Exon B10 (SEQ ID NO: 37)

EXAMPLE 2 Primer Design

[0109] 1. Primer Design Based on Common Sequences

[0110] Sense primers and antisense primers of the common sequences were designed using a primer designing software Oligo (available from Molecular Biology Insights, Inc.). In this procedure, a sense primer and an antisense primer were designed upstream and downstream of a target common sequence in the opposite directions to each other.

[0111] The sense primers F and antisense primers R designed on the common sequences are as follows.

[0112] a. Designed from Common Sequence C1 (SEQ ID NO: 29) Primer C1-F: 5′-GAAACAGTGATTATGAACACCG-3′ (SEQ ID NO:38) Primer C1-R: 5′-GCGACCGAGCCGGGAGT-3′ (SEQ ID NO:39)

[0113] b. Designed from Common Sequence C2 (SEQ ID NO: 30) Primer C2-F: 5′-GGAGCGGACCCCTGTGC-3′ (SEQ ID NO:40) Primer C2-R: 5′-CAGCCGCCAGCAGCAG-3′ (SEQ ID NO:41)

[0114] c. Designed from Common Sequence C3 (SEQ ID NO: 31) Primer C3-F: 5′-CGCAACATCGACGGCAG-3′ (SEQ ID NO: 42) Primer C3-R: 5′-CAGGGGGGACGCTGTGTA-3′ (SEQ ID NO: 43)

[0115] d. Designed from Common Sequence C4 (SEQ ID NO: 32) Primer C4-F: 5′-TGTGTGAGCCTTCTTATTGACG-3′ (SEQ ID NO:44) Primer C4-R: 5′-GCAGCACTTTGACACAGTCCAG-3′ (SEQ ID NO:45)

[0116] e. Designed from Common Sequence C5 (SEQ ID NO: 33) Primer C5-F: 5′-GAGACTGCCCTTCACCACG-3′ (SEQ ID NO:46) Primer C5-R: 5′-AGCACTTGGCGGGAGC-3′ (SEQ ID NO:47)

[0117] f. Designed from Common Sequence C6 (SEQ ID NO: 34) Primer C6-F: 5′-CCGGACCGTGGCTGC-3′ (SEQ ID NO:48) Primer C6-R: 5′-GGGCAATGCTGGGCAC-3′ (SEQ ID NO:49)

[0118] g. Designed from Common Sequence C7 (SEQ ID NO: 35) Primer C7-F: 5′-TACAGAACCTACCCTCTCAATG-3′ (SEQ ID NO:50) Primer C7-R: 5′-CTGCACCTGGGGCCTGT-3′ (SEQ ID NO:51)

[0119] h. Designed from Common Sequence C8 (SEQ ID NO: 36) Primer C8-F: 5′-TGATGCCAACTTCAGCACC-3′ (SEQ ID NO:52) Primer C8-R: 5′-CCCGTGGACAGCGTCTG-3′ (SEQ ID NO:53)

[0120] i. Designed from Common Sequence C9 (SEQ ID NO: 37) Primer C9-F: 5′-GTTTCTTCTAGGCAGTTGAGTTC-3′ (SEQ ID NO:54) Primer C9-R: 5′-CCTTCAAGCCAAAATCACTGAG-3′ (SEQ ID NO:55)
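Because an antisense primer is written 5′ to 3′ on the strand opposite to the common sequence, the site it corresponds to on the sense strand is its reverse complement. The short sketch below illustrates this using Primer C1-F and Primer C1-R as listed above; the helper function is hypothetical and is not part of the Oligo software.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(COMPLEMENT)[::-1]


primer_c1_f = "GAAACAGTGATTATGAACACCG"  # Primer C1-F (SEQ ID NO: 38)
primer_c1_r = "GCGACCGAGCCGGGAGT"       # Primer C1-R (SEQ ID NO: 39)

# Sense-strand site corresponding to the antisense primer
site_c1_r = reverse_complement(primer_c1_r)
print(len(primer_c1_f), len(primer_c1_r), site_c1_r)
```

Searching the common sequence for `site_c1_r` rather than for the primer itself is what locates the antisense primer on the sense strand.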

[0121] 2. Selection of Sets of Primers for Use in PCR

[0122] On the assumption that the common sequences are sequences of the gene to be cloned, predicted PCR amplification lengths were calculated on all the combinations of the designed sense primers and antisense primers, and 18 sets of primers which were predicted to have amplification lengths from 451 bp to 1057 bp were selected. The combinations are as follows.

[0123] Primer C1-F and Primer C1-R (predicted amplification length: 623 bp)

[0124] Primer C1-F and Primer C2-R (predicted amplification length: 897 bp)

[0125] Primer C1-F and Primer C3-R (predicted amplification length: 1037 bp)

[0126] Primer C2-F and Primer C4-R (predicted amplification length: 451 bp)

[0127] Primer C2-F and Primer C5-R (predicted amplification length: 634 bp)

[0128] Primer C2-F and Primer C6-R (predicted amplification length: 691 bp)

[0129] Primer C2-F and Primer C7-R (predicted amplification length: 921 bp)

[0130] Primer C2-F and Primer C8-R (predicted amplification length: 1057 bp)

[0131] Primer C3-F and Primer C6-R (predicted amplification length: 504 bp)

[0132] Primer C3-F and Primer C7-R (predicted amplification length: 734 bp)

[0133] Primer C3-F and Primer C8-R (predicted amplification length: 870 bp)

[0134] Primer C4-F and Primer C7-R (predicted amplification length: 584 bp)

[0135] Primer C4-F and Primer C8-R (predicted amplification length: 720 bp)

[0136] Primer C4-F and Primer C9-R (predicted amplification length: 1002 bp)

[0137] Primer C5-F and Primer C8-R (predicted amplification length: 570 bp)

[0138] Primer C5-F and Primer C9-R (predicted amplification length: 852 bp)

[0139] Primer C6-F and Primer C9-R (predicted amplification length: 685 bp)

[0140] Primer C7-F and Primer C9-R (predicted amplification length: 613 bp)
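The list above can be tabulated and checked as follows. The data are exactly the 18 combinations listed; the consistency check at the end is an added sanity test, not part of the described method, and rests on the assumption that predicted lengths are distances along one linear sequence.

```python
# (sense primer, antisense primer, predicted amplification length in bp)
primer_sets = [
    ("C1-F", "C1-R", 623), ("C1-F", "C2-R", 897), ("C1-F", "C3-R", 1037),
    ("C2-F", "C4-R", 451), ("C2-F", "C5-R", 634), ("C2-F", "C6-R", 691),
    ("C2-F", "C7-R", 921), ("C2-F", "C8-R", 1057), ("C3-F", "C6-R", 504),
    ("C3-F", "C7-R", 734), ("C3-F", "C8-R", 870), ("C4-F", "C7-R", 584),
    ("C4-F", "C8-R", 720), ("C4-F", "C9-R", 1002), ("C5-F", "C8-R", 570),
    ("C5-F", "C9-R", 852), ("C6-F", "C9-R", 685), ("C7-F", "C9-R", 613),
]

lengths = [size for _, _, size in primer_sets]
print(len(primer_sets), min(lengths), max(lengths))

# If lengths are distances on one linear sequence, the gap between the
# C7-R and C8-R products must be the same for every shared sense primer.
by_pair = {(f, r): size for f, r, size in primer_sets}
offsets = {
    f: by_pair[(f, "C8-R")] - by_pair[(f, "C7-R")]
    for f in ("C2-F", "C3-F", "C4-F")
}
print(offsets)
```

The offsets agree for all three sense primers, which is consistent with the lengths having been computed on a single concatenated sequence of common regions.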

EXAMPLE 3 Display System

[0141] 1. Schematic Display of BLAST Search and Exon Prediction:

[0142] (1) The partial sequence in question is input into Query window of a BLAST search screen (FIG. 6), and the lower limit of E value is input into an E value window to narrow the scope of the search.

[0143] (2) A click on a search button triggers a BLAST search, and a list of the results and regions homologous to the partial sequence are shown in a BLAST search output window.

[0144] (3) Based on information on the regions homologous to the partial sequence, a genome sequence to be adapted is selected with a click on a check box on the left side of the BLAST search output.

[0145] (4) A click on a selection button makes a region on the adapted genome sequence including the partial sequence automatically undergo the exon prediction programs.

[0146] (5) Information on the origin of the adapted genome sequence, the position of the partial sequence with respect to the genome sequence, and position information of exon sequences predicted through the use of the respective programs are schematically shown as box objects on a prediction output window 70 (FIG. 7).

[0147] (6) On the prediction output window 70, outputs of the plural prediction programs can be compared concurrently. By unmarking the checkbox on the left side of the name of a prediction program, the result of a prediction program showing an extremely different result disappears from the screen.

[0148] (7) With a click on the “Extract Common Sequence Alone” button, sequences common to the exon sequences predicted through the use of the respective programs (common sequences) are schematically shown as box objects on a common sequence window.

[0149] (8) At the same time, sense primers and antisense primers are automatically designed on the respective common sequences and are schematically shown as arrow objects on the common sequence window.

[0150] (9) A range of PCR amplification lengths is input in number input windows 71, 72, and a calculation button is clicked. By this procedure, sets of primers which are predicted to have amplification lengths within the input range are selected from combinations of the sense primers and antisense primers designed in the step (8). The selected sets of primers and their predicted amplification lengths are schematically shown on a primer set window 74.

[0151] (10) By marking one of the checkboxes 75 on the left side of each of the sets of primers, desired sets of primers can be selected. The names, sequences, numbers of bases, melting temperatures (Tms), and predicted amplification lengths of the selected sets of primers are listed.

[0152] (11) After the selection of sets of primers and display of the primer list, the “OUTPUT TO FILE” button 76 is selected to store the primer list as a file.

[0153] (12) With a click on any of the schematic displays (the genome sequence, partial sequence, exon sequences predicted through the use of the programs, box objects of the common sequences, and arrow objects of the primer sequences), a sequence corresponding to the clicked object 78 is displayed on a sequence window 77. With a click on an object of a common sequence, a predicted exon sequence from which the common sequence is derived is shown in box 77. In contrast, with a click on an object of a predicted exon sequence, a common sequence derived from the clicked predicted exon sequence is shown in box 77.

[0154] (13) In this procedure, a minority sequence is defined as a sequence which is not predicted by at least one program, among the exon sequences predicted through the use of the plural programs. Primers may be designed also on such a minority sequence.

[0155] (14) Primers are synthetically prepared based on the resulting sets of primers and are subjected to PCRs.
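
The extraction of common sequences in the step (7) and the notion of a minority sequence in the step (13) can be illustrated with a short interval computation. The sketch below is an assumption-laden illustration, not the display system's code: the function names, the program labels, and the exon coordinates are all hypothetical. Exons are half-open (start, end) intervals on the genome sequence, and each program's own exons are assumed not to overlap one another. "Minority" is interpreted here as any region predicted by at least one program but not by all of them.

```python
def coverage_segments(predictions):
    """Split the sequence at every exon boundary and count how many
    programs predict each resulting segment."""
    events = []
    for exons in predictions.values():
        for start, end in exons:
            events.append((start, 1))
            events.append((end, -1))
    events.sort()
    segments, depth, prev = [], 0, None
    for pos, delta in events:
        if prev is not None and pos > prev and depth > 0:
            segments.append((prev, pos, depth))
        depth += delta
        prev = pos
    return segments


def classify_regions(predictions):
    """Return (common, minority) interval lists.
    common: predicted by every program; minority: predicted by some only."""
    n_programs = len(predictions)
    common, minority = [], []
    for start, end, depth in coverage_segments(predictions):
        target = common if depth == n_programs else minority
        if target and target[-1][1] == start:
            target[-1] = (target[-1][0], end)  # merge with adjacent segment
        else:
            target.append((start, end))
    return common, minority


# Hypothetical exon predictions from three hypothetical programs.
predictions = {
    "program_A": [(100, 250), (400, 520)],
    "program_B": [(120, 250), (400, 500), (700, 760)],
    "program_C": [(100, 240), (410, 520)],
}
common, minority = classify_regions(predictions)
print("common:", common)
print("minority:", minority)
```

In this toy input the common sequences are the two intervals supported by all three programs, while the exon reported only by `program_B` and the disputed exon edges fall into the minority set.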

EXAMPLE 4 Determination of Coding Sequence

[0156] 1. PCRs Using Sets of Primers

[0157] 1-1. Reverse Transcription of mRNA (cDNA Synthesis)

[0158] A total of 250 ng of brain mRNA (available from Clontech) was added to 10 pmol of an oligo d(T) primer (SEQ ID NO: 56), sterile water was added to the resulting mixture to a total volume of 4.9 μl, and the mixture was left to stand at 70° C. for 10 minutes and was then cooled on ice. To the resulting mixture, 2.4 μl of 25 mM MgCl₂, 1.0 μl of 10 mM dNTPs, 0.2 μl of 0.1 M DTT, and 1.0 μl of 10 times reverse transcription (10×RT) buffer [200 mM Tris-HCl (pH 8.4), and 500 mM KCl] were added to a total volume of 9.5 μl. The mixture was warmed at 42° C. for 5 minutes and was then subjected to a reverse transcription reaction at 42° C. for 60 minutes with 25 units (0.5 μl) of SuperScript II reverse transcriptase (available from GIBCO BRL). The reaction mixture was heated at 70° C. for 15 minutes to deactivate SuperScript II and thereby terminate the reaction. RNAs in the reaction mixture were decomposed with 0.5 μl of RNase by warming at 37° C. for 20 minutes.
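
The volumes in paragraph [0158] can be checked with simple arithmetic. The sketch below only re-adds the stated volumes and derives the final reagent concentrations in the reverse transcription reaction. Those concentrations are computed here for illustration and are not stated in this description, and the helper name `final_conc` is hypothetical. Volumes are kept in the unit used in the paragraph; the concentrations do not depend on that unit.

```python
from fractions import Fraction as F

# Volumes as stated in paragraph [0158] (exact arithmetic via Fraction).
annealing_mix = F("4.9")   # mRNA + oligo d(T) primer + sterile water
additions = {
    "MgCl2 (25 mM stock)": F("2.4"),
    "dNTPs (10 mM stock)": F("1.0"),
    "DTT (0.1 M stock)": F("0.2"),
    "10x RT buffer": F("1.0"),
}
enzyme = F("0.5")          # SuperScript II reverse transcriptase

pre_enzyme_total = annealing_mix + sum(additions.values())
reaction_total = pre_enzyme_total + enzyme


def final_conc(stock, volume, total):
    """Final concentration after dilution: stock * volume / total."""
    return stock * volume / total


mgcl2_mM = final_conc(25, additions["MgCl2 (25 mM stock)"], reaction_total)
dntp_mM = final_conc(10, additions["dNTPs (10 mM stock)"], reaction_total)
dtt_mM = final_conc(100, additions["DTT (0.1 M stock)"], reaction_total)
buffer_x = final_conc(10, additions["10x RT buffer"], reaction_total)

print("volume before enzyme:", float(pre_enzyme_total))
print("reaction volume:", float(reaction_total))
print("MgCl2 (mM):", float(mgcl2_mM), "dNTPs (mM):", float(dntp_mM),
      "DTT (mM):", float(dtt_mM), "buffer (x):", float(buffer_x))
```

The sum before the enzyme is added reproduces the stated total of 9.5, and the enzyme brings the reaction to 10 volume units, so the 10×RT buffer ends up at 1× (20 mM Tris-HCl, 50 mM KCl).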

[0159] 1-2. PCR Using cDNA as Template

[0160] For each of the sets of primers, 10 pmol each of a sense primer and an antisense primer, 3.2 μl of 2.5 mM dNTPs, 2 μl of 10×buffer [final concentration: 10 mM Tris-HCl (pH 8.3), 50 mM KCl, and 1.5 mM MgCl₂], and 0.5 unit of Taq DNA polymerase (available from Perkin Elmer, Inc.) were mixed, and sterile water was added thereto to a total volume of 20 μl, followed by PCR using 0.4 μl of the cDNA prepared in the step 1-1. In the PCR, an initial denaturation was performed at 94° C. for 5 minutes. Subsequently, an amplification reaction was performed by repeating, 35 times, a cycle of denaturation at 94° C. for 30 seconds, annealing at 55° C. to 60° C. for 30 seconds, and elongation at 72° C. for 1 minute. Finally, elongation was performed at 72° C. for 5 minutes.

[0161] A total of 2 μl of PCR products was subjected to electrophoresis on a 1% agarose gel to detect amplified bands.
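
The thermal cycling conditions of paragraph [0160] can be written down as a small data structure, which makes the programmed hold time easy to compute. This is an illustrative sketch only: the `Step` class and the function name are hypothetical, and the computed duration excludes the temperature ramp times of a real thermal cycler, which this description does not give.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    name: str
    temp_c: tuple   # (low, high) in degrees C; equal values for a fixed temperature
    seconds: int


INITIAL = Step("initial denaturation", (94, 94), 5 * 60)
CYCLE = [
    Step("denaturation", (94, 94), 30),
    Step("annealing", (55, 60), 30),   # 55 to 60 degrees C, depending on the primer set
    Step("elongation", (72, 72), 60),
]
FINAL = Step("final elongation", (72, 72), 5 * 60)
N_CYCLES = 35


def hold_time_seconds(initial, cycle, n_cycles, final):
    """Sum of programmed hold times; ramping between temperatures is ignored."""
    return initial.seconds + n_cycles * sum(s.seconds for s in cycle) + final.seconds


total_s = hold_time_seconds(INITIAL, CYCLE, N_CYCLES, FINAL)
print(f"programmed hold time: {total_s} s ({total_s / 60:.0f} min)")
```

Under these conditions the holds add up to 4800 seconds, that is, 80 minutes per run before ramping is taken into account.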

[0162] Amplified products in the following sets of primers were detected.

[0163] Primer C2-F and Primer C4-R

[0164] Primer C2-F and Primer C5-R

[0165] Primer C2-F and Primer C6-R

[0166] Primer C2-F and Primer C7-R

[0167] Primer C3-F and Primer C6-R

[0168] Primer C3-F and Primer C7-R

[0169] Primer C4-F and Primer C7-R

[0170] 2. Sequencing of PCR Amplified Products

[0171] 2-1. Purification of PCR Amplified Products

[0172] The total amount of the seven PCR amplified products was subjected to electrophoresis on 1% agarose gel, target amplified products were cut from the gel and were purified using QIAquick Gel Extraction Kit (available from QIAGEN) in accordance with a protocol in a manual attached to the kit.

[0173] 2-2. Sequencing of PCR Amplified Products

[0174] Each of the PCR amplified products was subjected to cycle sequencing and purification using 100 ng of the purified PCR amplified product as a template and 3.2 pmol of the sense primer used in the PCR in question with ABI PRISM BigDye (TM) Terminators v2.0 Ready Reaction Cycle Sequencing Kit (available from Applied Biosystems) in accordance with a protocol in a manual attached to the kit.

[0175] The results of sequencing of the amplified products of the respective sets of primers are as follows.

[0176] Base sequence of the amplified product of Primer C2-F and Primer C4-R: (SEQ ID NO: 57)

[0177] Base sequence of the amplified product of Primer C2-F and Primer C5-R: (SEQ ID NO: 58)

[0178] Base sequence of the amplified product of Primer C2-F and Primer C6-R: (SEQ ID NO: 59)

[0179] Base sequence of the amplified product of Primer C2-F and Primer C7-R: (SEQ ID NO: 60)

[0180] Base sequence of the amplified product of Primer C3-F and Primer C6-R: (SEQ ID NO: 61)

[0181] Base sequence of the amplified product of Primer C3-F and Primer C7-R: (SEQ ID NO: 62)

[0182] Base sequence of the amplified product of Primer C4-F and Primer C7-R: (SEQ ID NO: 63)

[0183] 2-3. Assembling of Sequence Data and Determination of Coding Sequence

[0184] The sequences (SEQ ID NOs: 57 to 63) of the amplified products of the respective sets of primers and the partial sequence (SEQ ID NO: 1) were assembled using the assembly software SEQUENCHER (TM) (available from Gene Codes Corporation), thereby yielding a cDNA sequence (SEQ ID NO: 64) of 1251 bp. From the cDNA, a coding region (SEQ ID NO: 65) of 462 bp was identified.
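
This description does not state how the coding region was identified within the assembled cDNA. One common approach, shown here purely as an assumed illustration, is to scan the three forward reading frames for the longest open reading frame running from an ATG to an in-frame stop codon. The function name `longest_orf` and the toy sequence are hypothetical, and the sketch does not examine the reverse strand.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def longest_orf(seq):
    """Return (start, end) of the longest ATG-to-stop open reading frame on
    the given strand, end exclusive and stop codon included, or None."""
    seq = seq.upper()
    best = None
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i                      # first ATG opens the frame
            elif start is not None and codon in STOP_CODONS:
                if best is None or (i + 3 - start) > (best[1] - best[0]):
                    best = (start, i + 3)
                start = None                   # look for the next ORF in this frame
    return best


toy_cdna = "CCATGAAATTTGGGTAACC"               # hypothetical sequence, not SEQ ID NO: 64
orf = longest_orf(toy_cdna)
print(orf, toy_cdna[orf[0]:orf[1]])
```

As a plausibility check on the reported result, a coding region of 462 bp is a whole number of codons (462 = 3 × 154), as expected for an open reading frame.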

EXAMPLE 5 Cloning Method

[0185] 1. PCR and Sequencing of Coding Region

[0186] 1-1. Design of Primers

[0187] Primers for the amplification of the coding region were designed based on the determined coding sequence (SEQ ID NO: 65).

[0188] 1-2. PCR Using cDNA

[0189] PCR was performed in the following manner. Using the cDNA prepared in the step 1-1 in Example 4 as a template, 10 pmol each of the primers prepared in the step 1-1 in Example 5, 3.2 μl of 2.5 mM dNTPs, 2 μl of 10×buffer [final concentration: 10 mM Tris-HCl (pH 8.3), 50 mM KCl, and 1.5 mM MgCl₂], and 0.5 unit of Taq DNA polymerase (available from Perkin Elmer) were mixed, and sterile water was added thereto to a total volume of 20 μl. In the PCR, an initial denaturation was performed at 94° C. for 5 minutes. Subsequently, an amplification reaction was performed by repeating, 35 times, a cycle of denaturation at 94° C. for 30 seconds, annealing at 55° C. to 60° C. for 30 seconds, and elongation at 72° C. for 1 minute. Finally, elongation was performed at 72° C. for 5 minutes to thereby yield target PCR amplified products. A total of 2 μl of the PCR products was subjected to electrophoresis on a 1% agarose gel to thereby detect the amplified products.

[0190] 2. Sequencing of PCR Amplified Products

[0191] 2-1. Purification of PCR Amplified Products

[0192] A total amount of the PCR amplified products was subjected to electrophoresis on 1% agarose gel, target amplified products were cut from the gel and were purified using QIAquick Gel Extraction Kit (available from QIAGEN) in accordance with a protocol in a manual attached to the kit.

[0193] 2-2. Sequencing of PCR Amplified Products

[0194] Each of the PCR amplified products was subjected to cycle sequencing and purification using 100 ng of the purified PCR amplified product as a template and 3.2 pmol of the sense primer used in the PCR in question with ABI PRISM BigDye (TM) Terminators v2.0 Ready Reaction Cycle Sequencing Kit (available from Applied Biosystems) in accordance with a protocol in a manual attached to the kit.

[0195] 2-3. Assembling of Sequence Data

[0196] 3. Subcloning of PCR Amplified Product

[0197] The PCR product amplified by the step 1 above was purified and was then subjected to ligation with pGEM-T vector. Part of the reaction mixture was mixed with 40 μl of competent cells (XL1-Blue), was heated at 42° C., and was inoculated onto a Luria-Bertani (LB) agar medium containing 50 μg/ml ampicillin, followed by incubation at 37° C. for 17 hours to allow colonies to grow.

EXAMPLE 6 Results Presentation

[0198] Part of the colonies grown on the agar medium was cultured in 1 ml of an LB liquid medium at 37° C. for 17 hours under shaking. The resulting culture mixture was centrifuged with a centrifuge (available from Hitachi High-Technologies Corporation under the trade name of HIMAC CF15R) at 3000 rpm for 5 minutes, and the supernatant LB medium was removed. A plasmid DNA was extracted according to the alkali-sodium dodecyl sulfate (alkali-SDS) method. Using the extracted plasmid DNA as a template, sequencing was performed with the primers designed within the coding region. The result of sequencing and the plasmid DNA were submitted.

[0199] While the present invention has been described with reference to what are presently considered to be the preferred embodiments, it is to be understood that the invention is not limited to the disclosed embodiments. On the contrary, the invention is intended to cover various modifications and equivalent arrangements included within the spirit and scope of the appended claims. The scope of the following claims is to be accorded the broadest interpretation so as to encompass all such modifications and equivalent structures and functions.

[0200] The foregoing invention has been described in terms of preferred embodiments. However, those skilled in the art will recognize that many variations of such embodiments exist. Such variations are intended to be within the scope of the present invention and the appended claims.

[0201] Nothing in the above description is meant to limit the present invention to any specific materials, geometry, or orientation of elements. Many part/orientation substitutions are contemplated within the scope of the present invention and will be apparent to those skilled in the art. The embodiments described herein were presented by way of example only and should not be used to limit the scope of the invention.

[0202] Although the invention has been described in terms of particular embodiments in an application, one of ordinary skill in the art, in light of the teachings herein, can generate additional embodiments and modifications without departing from the spirit of, or exceeding the scope of, the claimed invention. Accordingly, it is understood that the drawings and the descriptions herein are proffered by way of example only to facilitate comprehension of the invention and should not be construed to limit the scope thereof.

1 65 1 154 DNA Homo sapiens 1 cacagaggag gctggcagag ctggggactg agggcattgt tgctgattct cactcaccgg 60 ggcagcctgc cgcagatgca caggccccag gtgcaggcca ccacctccgg gtcggcacca 120 ggactgccct cggtgctcat agggaatggc tggg 154 2 154 DNA Homo sapiens 2 cccagccatt ccctatgagc accgagggca gtcctggtgc cgacccggag gtggtggcct 60 gcacctgggg cctgtgcatc tgcggcaggc tgccccggtg agtgagaatc agcaacaatg 120 ccctcagtcc ccagctctgc cagcctcctc tgtg 154 3 12437 DNA Homo sapiens 3 tctgtaataa taagtaggaa accaagaatg tctctgtttt aagagtatcc cattgcccaa 60 tttataaaac ttagcaatac tgcaggactt cctccaaaaa atgccgtaag aaaaatacaa 120 agtgctgtgt ttagaaattt tttaaaaaaa gaagcgctat ggaagtggag agaagtgcag 180 gatgaaatct gctgccctca ggctgttctc agccagggcc tgagcgtggt ggaggaggag 240 tttgggtcat tcctgcccag agcagggctt ctctcacgga gaaccttcat gctgaggtgc 300 acatctccca gctgagactt tcctagagct ccatggtggt ctgagactct ccctacccaa 360 ttcttcctcc tcgtttccat gcacaagtgt caggtctgca ccacaatctg agactctccc 420 cgcctactct tgctctttcc cggtttattc tttacaggca tttcccccaa taaatcgcct 480 gtatgtctgc ttttcagaga ccaggaactg atacaagtaa atactaagag tggtccaaga 540 aaacagaagg ccagctgggg ttttggaatt gggtgactca cggtcctgat ggcaaaaagg 600 ggctcattcc aagaggaatg tgaggtatag gacagacctt ggtaacaagt ggtggcctga 660 atggtgacga tttcaccaga gttaacccag gagaagtgac gattatgcag gtgtgagctc 720 acaataagag aggtggtgac agagctgagg tactagaggc aggcggtctc caggggcagg 780 caggatctag ggtaatagag gcaggcgggg tctgatagaa gctgtcagaa gtcattactc 840 taatgactgg cagaggtgga gagacagccc acggggcctg gcccttagag ggggcagata 900 ggacatagta ttccaaaata aataggcagc tagggagggt tctacataat atctgcaact 960 aggagaaggc caggaaggag gagagggaag ctggaggtgg agatggagca gaagtagccc 1020 caattcctat cacttccacg gggagacatc cattcccaaa ctcccaagtt ctgccagggt 1080 ggagatcctg gtctgtctcc aaaaaaggtg tctcacatgt gccagaaaaa acaaaaagaa 1140 aaaattaaat aaccctaaaa tgtgtcctct tcccaggaga ctcagcacct gcccttcgaa 1200 tcccaagcta aagctgctgc cagggcactt agactcttgt gtcaaggcag cagcaggtgc 1260 gaacagctgt gactttcttg acagtagtga ttgctgccga tgggtaggag gaggtagagc 1320 tgtctttgaa 
cacgggaagg aacacttgta caatccaggg gtgcacttgg gggcctcttg 1380 gtactccctt gtcccattgt aactgttgat ggacacaggc agcaattcca gccttagaaa 1440 tgtatgatga ttatgagttc aggccccttg agtgtcaagg tttgagtcat gttacaaggt 1500 gagccatgaa gacttattga ggtgatagct aagagggata ggaaaattta aaatatggtg 1560 gaggaagaag aggatgaatg cccattgaag ccctgagact aactgcagag agagatccta 1620 tttgtaccac tgaccttctg tttctaagtt ccccattgag aggctcttgg gaatcataag 1680 gaaactgctc ccctaaatat gtacggagaa atagatttat ctggtggaag gaggtagact 1740 gtggtagtga tggaagtgat ctttctaaag agaagctgct ctgggagcag agttgactca 1800 cagattccaa cttccacccc actggagcca ccatgcctct cacacttcct ccaggatgtt 1860 cccagacagt ggctgagcac agcggggata cgagtgctgg tccaagcctg cctggaccct 1920 ccaatgggca acctttgctc aggcactctc catcagcctg gcaaaactac actcagacct 1980 gagctgtgat ctgagactcc ttatatccag accttcctta ttgctcttcc ttcacagata 2040 ccaggctttt atctccatct gcgcgttctc ccacaatacc cctgcacctc cccgtttact 2100 cttcatggaa ttttccctaa taaatctatt gcacgtctag cccatcttgg tgtctgtttc 2160 ttggagaact gaaactgaca cactgcccaa taagcctttg gggagcactc ctaaccctgg 2220 cttatttcat tgtttgcttc ccagagaaca tagcagcact cttgtaatct tactaaaaat 2280 gcaaatgctg ggccccacac ctgaagtttc tttttttttt ttgagacaga gtctcactcc 2340 attgcccagg ctggagtgca gtggcatgat cttggctcac tgcaacctct gcctcccagg 2400 ttcaagcaat tctcctgcct cagcctccca agtagctggg actacaggca cccgtcatca 2460 tgcctggata attttagtat ttttagtaga gacagggttt caccatgttg gccaggctgg 2520 tcttcaactc ctgaccttaa ctgatttgcc tgcctcggcc tcccaaagtg ctgggatttc 2580 aggtgtgagc cactgtgcct ggccttgaag ttctttttaa ctaggtaagg gtgggggccc 2640 aggaactggt ttttcttatt ggtgtctcag gtacctgtgt tgcagttaaa attagatgat 2700 agatactatt taattttcaa cactcattac tctaacacta tcaaactaat ataagagtac 2760 tgaaattaat ctttatattt aaaagaaaac aaaagaagaa attccagcca aaaggcacat 2820 gcttacccaa agggatctat ggattcaatg caatctgtgc caaaatactg atgatattct 2880 tcatagaaat ggaattaaaa accctaaaat ttatatggaa ccacaaaagt tcacaaatag 2940 ccaatacaat cttgatcaaa aagaacaaag ctaaggccag gctgatgcct gtaatcccag 3000 cactttgaca ggccaaggca 
ggtgaattgc ttgagtccag gagtttgaga cctgcctggg 3060 tgacatggtg aaaccccatc tctacaaaaa atacaaaaat taagccgtca tggtggtgct 3120 tgcctgtagt cccagctact caggaggctg aggtgggagg attgcttggg catgggagga 3180 tctcttgaga cagtgaggtg gaggttgcag tgagccaaga ttgcatcact gtactccagc 3240 ctgggtgaca gagcaagacc atgtctcaaa aaaacaaaac aaaacaggct gggtgcagtg 3300 gctcacgcct gtaatcccag cactttggga ggctgaggca gacggatcac gaagttagga 3360 gatgaagacc atcctggcca acatggtgaa accccgtctc tactaaaaat acaaaaatta 3420 gctaggcgtg atggtgtgca cctgtagtcc taactactca ggaggttgag gtaggagaat 3480 tgcttgaacc cgggaggcag aggttgcagt gagctgagat cgtgccactg cactccagcc 3540 tggtcaaaaa aaaaaaaaaa aaaaagaaat gataaatgtt tgaggtgata aatatgttaa 3600 ttaccctaat tttgttatta ctcattgtat gcatgtatca aacaatcata ccatacacca 3660 taaatataga taattattgt gtcaatttaa aataaaacct aaaaaggtca tgtgctgtaa 3720 tattacaaga aagagcaaaa cttttaagag ctaaggatga actgaaaaca ttgatagaat 3780 gtataaagag tggaattaat gccataaagc agggttgtca cactgtgccc cttggtcaaa 3840 tctggtttgc tgcctgtttt tatgagtttg tttgtttgtt tgtttttttg agacagagtc 3900 ttgctctgtt gcccaggctg gagtgcagtg gcgcgatctc ggctcactgc aagctccggc 3960 tcctgggttc atgccattct cctgcctcag cctcccgagt ttttatgagg ttttattgga 4020 actcagtcat acacatggat catgtattat gtatggctca gtaaacataa ggaggctctg 4080 tagccagacc acttggattc caattccaat tatgccactc actggttggg tgctcttggg 4140 caggtaattt agcctctctg tgcttaagta ttgttatctg caaaatggag atagtaacag 4200 gacctacctc atagggttaa tatgaggatc aaatgtgatc atgtatatac cataatttct 4260 tagcacatgg cctgagaggt gttaaaagag ttaaaaaagt gttattttaa tggatcggtt 4320 tgatcatccc tagaccaact catcaaattt attatcacaa aaagaaggaa agccagacat 4380 cataggcttc cagatttaat gcaattggaa gtttacaatc tatgaagtct tcttaccaaa 4440 aaacctggat ctaattaagc atctacacta tagatccacc ttccagtgta cagaaaatac 4500 ggggaacgaa gaacatgtta catgacacca caaggatgca atcagcaaaa tctagaattt 4560 ctttaaaaaa tacatcacaa gtttggcaca gtggtgcgtg cctgtagtct gttactcagg 4620 aggctgagac aggaggtgat atggtttgga tttgtgtccc cacccaaatt tcatgtcgaa 4680 ttgtaatccc caatgttgga ggaggagggg 
cctggtagga ggtgattgga tcatgggggc 4740 agatttccct cttgctgttc tcatgatggt gagttctcac gagatctgat tgttcaaacg 4800 tgtgtagcac ctccccttct ttctcttccc tgcttcccct ttaccttctg ccgtgattat 4860 aagtttccta aggcctccca cccatgattc ctgtacagcc tgcagaactg tgagccaatt 4920 aaacctcttt tgtttatata ttacccagtc tcaggtattt cttttctttt tttttttttt 4980 gagacggagt ctcgatctgt cgcccatgct ggagtgcagt ggcgcgatct cggctcactg 5040 caagctccgc cttccgggtt cacgccattc tcctgcctca gcctcctgag tagctgggac 5100 tacaggcgcc cgccaccgcg cctggctaat ttttttttta ttttttttat ttttagtaga 5160 gacggggttt caccgtgtta gccaggatgg tcttgatcac ctgacgtcgt gatccacctg 5220 cctcggcctc gtaaagtgct gggattacag gaatgagcca ccgtgcccgg tccagtccca 5280 ggtatttctt tatagcagtg agagaacgga cctaatacaa caggattgct tgagcccagg 5340 agttcaaggc tgtagtgtgg gaggatagtg cttatgaata gccactaaac tccagtctgg 5400 gcaacatcgg gagacccttt ctgtaaaaaa acaaaaacaa aaacaaaaaa agggtggaga 5460 ctgtcagatt ccaagtacag gtgaggtaga agggatgtgc tgtaaatata aaaaaacaca 5520 ggatggaagc aaaagaatga aaaaatccat acatgcaaac attaattgta agaaagctga 5580 agtggccata tttacatcag acaatgtaga cattctagca aaaatattac taaaaatttt 5640 actaaatatt actaaaaatt tactaaatta ctaaaattta ctaaattact aaaaatttac 5700 taaattacta aaaatttgct aaaaaattta ctgcaatatt actaaaaaaa gataaagagg 5760 gacatttagt aatgatcaaa ggatcaattt atcaagaaga tacggtatta atagtgtatg 5820 aatttagtaa aaaagcttca ataaacatac ggaaaaatgt gacagaactg aaaaagaaaa 5880 acacaaattc agattcacag ttggaaactg gcctcagtct ctcaggaaat tatagaagaa 5940 tatagaaaat cagaagggta gagaagactt gagtaacgtt gtcaaacaat ttgccctgac 6000 taaaatttat agacagttca cctgactgcc aaatatatgt tctattagtg tgtataggga 6060 acattcagta agatagacca tatgctgcac tataatataa atcttaataa acttaaaagg 6120 attgaaatac tatagagtat attatgtata gattctgtat taaattagaa ctcaatgaca 6180 agatgatatc tggaaaattt ccaaatactt ggaaattaga acacacatgt ctatataagc 6240 cacagatcca agaagagatc ataagaaatg ttagaaaatg ttttataccg aatgaaaatg 6300 aaatcataac attgaacttg tgggatgcag ctaaagcagt gattagaggg aaatttatgc 6360 ctttgtccac tttatcaatg cccaacctca ctggcatgac 
taactagcag gaaagctgat 6420 gttcttcacc ctggagccaa acacacacct aagactttga gaagctgaaa atagggagcc 6480 agtgacagcc aatggctgct tatattagaa aacgaaaaaa cgtttgaaat gaggtatatg 6540 ttgccttgaa aatccaagga agcctgtttg gtgccatgtt tctttcttag tggctggagg 6600 ggtcagatgc agccccttcc ctttactttg ggaggtgtat cttgacatgc cccagatcaa 6660 tagaaatttg aatttctgtc ccaccctctg gcatgcatgc gtcttactga aatgctgaca 6720 gtagacacta attcctgctc atcagatcca atcagcttgg tatcatcaaa acactcgatg 6780 ggagtcatag cctggtgaat ggagaggcag ccaagatttc tgcagatcag attatgacat 6840 aaaatcagaa ggttgataaa gttctgaggt gagagagtga tggtgaattg ctgccccagc 6900 cagctggaag caaactgttc ctcatggtct tcactaacca cgtgtaacca cgtagccaga 6960 gaacgcattc atcagaccaa tagctgaata ccagcaatga aaccgtttct gaaacagcag 7020 cttcacctct gatgaagttt ataataatcc actgtcattc tcccagaaac tgcctgtctt 7080 ctgcacaggc aaaattatgt aatttgaatg gaaatggggt gggtgggagc agggggagtc 7140 accatcctgc atctttcatg cttttttttt tgtttttttg tttttgtttt ttgttttttt 7200 atttttgaga cagagtctca ctctgtcgcc aggctggagt gtagtggcgt gatctcggtt 7260 cactgcaacc tctacctcct gggttcaagc gattgtcctg cctcagcctc ccgagtagct 7320 gggactacag gcttgcggca ccacgcccag ctaatttttg tagttttagt agagacgggg 7380 ttttaccatg ttggcaagga tggtcttgat ctcttgacct cgtgatccgt ctgcctcagc 7440 ctcccaaagt gctgggatta caggcacggt ggttcatgtt cttgatggta gcactaaact 7500 ctgcaattcc tccagcaatg tgaaatgcct ttttcctatg ttggtagata gaggcagttt 7560 cagtgccttc tgcttggctc agagaatcca tgtagggact ctgccagttt ctagtagata 7620 tattcaaatg atgcattctg gaactgaaga aataactaca ggattcaaga acccactggc 7680 ttatctgaga cgaacctatg cccaaactcg actgatcatt ttttctatca actagtattc 7740 tagctgatag acatagtggt attttagctc tccaggaatt agtgtcagtt cagagccatt 7800 gacccgcaat tgctgagtag tcgttatttc cctttcttca gtgcatagtc aatatacaac 7860 ctagtaaata tccatgagat tctttgggaa ggtcaggagg acgagttaca gtacaaactc 7920 tgtggctcag agttcaggtg tacctggacc cactttcatt caaggagctc tgggtctgtg 7980 aactggctca actctaggaa ttggttggga ggctgaaact ccctattaaa atgatttttc 8040 tttactgctc ttcgccagac ctagagcttt cctgcttgta aaaatcaagt 
aaggcttcag 8100 taggctgcca tgtgtttcag ctctagggat gctgtcactg actaatcatt gcccaagatc 8160 tctgaggaca gaccatctgg atgactgtgc cagttctgct gcccatgatg gcagctggtc 8220 ccacactggc tttggcaatt aagtgccacc actcagtcat cagattctat cattcccagt 8280 gaattcagcg agtaaagctt aatgacagca acttccccac catcattacc gacctagact 8340 ccccaaggat taagaagccc ccacccccga cccccaccgt cctgacaaat ggacttctca 8400 gtcttgtgag ggggttttct tcagattctg agagccagag ttaggcgatg ggtggacagg 8460 ttttacatga aaaatctact ccagcattct aagcctttag gagacatttt tccaaagtgt 8520 actgagaaag ttctggcatt tcacctgtag aaacatgcct agccatttga acaagcaaac 8580 tgactttaga attgtgatgt ggtaaacctg attaaaaatt ttggcctgat ccaacattat 8640 tttccttcca atctaatcca gtatcccttt gatccattcc tatataagtt ctacagattt 8700 ctgccaatgt cacttggcaa aaatcttgca acttttatgg tgcgtgttgt gccttattag 8760 gcctcaaacc cctctgaacc ttacttacct ttcagggccg ggcaggccga gtctgggtta 8820 caggtctaga agcagtgaca ggtgaggtgg ggggtagatc ctgaggagga tccccagctc 8880 tttgcaaatc agcttccact ggtgaggcca tcatagattc atcaggcaag attaaccctc 8940 tgagtaagca caggtggatt ccttctactg gccaaagaag gctcagcaga attcaggggt 9000 ttaaggtctt cagcttcatg aaaatctgcc catttgctcc cattccattc ttcagcatcc 9060 cactcctttc caattagttc tctaccttta ccatgagaga ctctgcagag ttgagaatag 9120 aatctgcatt gtttcagtca cctgcaagct tatactttga gtttagcttt cagaaatctt 9180 ggctgtgtgg ctacgagaga taagggtttc ttgcagggtg gccatataag atttctggtc 9240 cctggcctgg tctttgagct ggagaaaacc ctaaactgat gactttgttt tccttaaatc 9300 caatcagatc aggatttctt gtgccagcca agatggtgac agcccattac agtctctgtc 9360 tgccacagct acaaataaaa accagacaag ctaccaaaaa caacttccta agggctttgt 9420 tttttttttt ttaatttttt ttagagatgg gggactcact ttgttgtcca ggctggtctc 9480 aaactcctgg gtttaagcga ttttcacaag tagctgggat tacaggtgtg tgccactgca 9540 cccagcaact tcctaaggac tcggaaaagt taacaatggc aggaagattg gagagaggaa 9600 ccaaaacttg aagaacccta tatatttttc aatactctta tatttaggaa tagaatcagt 9660 caatataaat tgtaaatgaa atgcttcttc acagcctttg gaggatgggt tataccagtg 9720 attgccaaac ttgcagcaat agttaaaaaa aattataggg ctgcctcccc gtctctccct 
9780 ggagcccaga atctccagga gaaacccaga aatatacatg tttaacaaaa ccttgatttc 9840 tcgagttagt aaatttagga aaaatggctt atccaagcaa atagaattgg atttcctaaa 9900 aatttagtcc tcacccaagc tatttatttt ggaggcaaaa gaagagtatc ttatgaccta 9960 gaagactcta aaaatgtggt atgaagactg tctttattag aaaaaccagg ggttaacaca 10020 tctcatttaa caatgtgtag ggggtcccca actacaccac tttccaaata gcggtcttat 10080 cagattttaa aatgctgata caaaagttat gatcttataa aatgacattt acttccaaca 10140 tggtgaaacc ccatctttac taaaaatatt ttaaaaacca gctgggtgtg gtggtgggca 10200 cctgtaatcc cagctactct ggaggctgag gccagagaat tgcttgaacc caggagacgg 10260 agatggcagt gagacaacac ggtgccactg cactctagcc tcggggacag agtgagactc 10320 cgtctcaaaa aacaacaaca acaaaaaacc acatgtggta gttagtttaa aaatcacaga 10380 ccttaggcta ggtaatcagc aaatatttgc taaatgagta atagaaactt tgtcaatatc 10440 atagaggttc ttacaataga ttcaaaaggt aattggaacc aacatgcaat aaaaagagat 10500 taacacaaaa atagataagt ggattactat ttaatgaata tttttggaaa aactctattt 10560 cagtttagaa aagaatgaac tgagctccta tattttcatt ctacacattg cagtgaattc 10620 cagatgggac tcaaagactt agcaacaaca acagaacaga acagaataat ttatattttc 10680 tggctcttgc tggggaaata cattctaaat attgaagaaa aaggaaatac tttcagggaa 10740 tgcatttctg aatatatatg taaactttgc aaagcttttt ctgtaatgaa atcttttaat 10800 attttggtca agcaacagaa tggaggaaaa acaaataata agcaaatact tgcttattaa 10860 tatgtttagg aaacacttgg ataacacaca tacacatatg aatatataca cacacacata 10920 tatatatatg ttctacatga tcatactaaa atttagatga cacttcgtcc aggaagaaat 10980 attgaagaaa atgcaatcat acaaacacga cattgccaca ggggaaaaca taacatccat 11040 ggtaaatata acgatacaaa aatctgttac atttagtaaa aagtgcagaa aaccggcaag 11100 aaaaatgcag gagcctgatt ttggtagaaa gtggcaaaaa ggctctaaat gcccagtgca 11160 aatggcttga aattccattc ccaggcctgg ggctccacat ttccaggggc tctgctttcc 11220 aaaagatatg gttcaatgtc ccctaggtat agtgccttcg gtctggactg cagatggaga 11280 cctgggtcat cccagaagct ttgtagtgtg agagatcaaa ctggctgctc tgccaggcct 11340 cacctatttc cagactggat gattcctgat acttttactt ttgagcccaa actaacttct 11400 gcatggaggt tcagtacagg tcaggtgagt gacgaatctc 
ctgacagggc tacagtgaac 11460 tttcacaagg ttaaaccttc taaacttggg aagctgcttt gcaaaaataa ataaataaaa 11520 taaaaatccc ttactcaatt gttgactggg tattattccg gaaacaggaa acaggttttt 11580 catttctgcc aggttgtgcc ccagcttcca tttgctaagc agcttagggg aacattggat 11640 taaaaaccac ccgtaaagaa aaggcattaa gaaattgatt tgaacgattt gaaattatta 11700 tcttgaaaga tcaaagaaac agtgattatg aacaccggct ctggtgggaa gcggttcttg 11760 aaagttaaag catgggggaa ctcgagtact ctggagcatc atcagcatct gtacaatgca 11820 agtgaactct ggaagctaga cgcatgcttt gcacgtggac acagacacat gcttgcacag 11880 gacacatttg caatacatac atgtttatac atggacacat gttttgcaca cgcttgcaca 11940 ggacacgctt gcacaggaca catgcacagg atgcatgttt tgtacaggac acatgttttg 12000 cacatggaaa cacatatcca tgctttgcaa agggacacag ggaagcatgc ttacaatgga 12060 tacaagttta cacacggacg catgttttgc acaggacata tgcacaggat gcatgttttg 12120 cacacggaca cacacgttta cataggacaa gtgcacagga cacgttttgc accggacgta 12180 ttctgtgcac gggacgcacg gtttgcagat gcaagcacac acactcctcg gcgcgacggg 12240 cgaagcgggg tgggcagcga cagcggagtc cggacccagg gacagcgcct ccgggaacct 12300 cgccaaggcg gcgggaacta caactcccgg ctcggtcgcc tcgggacgac gcgcggccca 12360 gcgcagactg gccccgccca gcggccccgc gcggcccggc ccaggtcccg gcgcccagag 12420 tcgccgcgcg gccgccg 12437 4 133 DNA Homo sapiens 4 ctttctggga ccctgggggg ctatgtcact caggactctt gacttttccc tgagagctga 60 ggatgagccc attgagagca gcatcagaaa accctccctg ggggcttccg gtgcctgccc 120 tggttcttcc agg 133 5 175 DNA Homo sapiens 5 ctgctggtgg tctccggcca gcagaacctg gctgtgcagc agccagctgg cgagatggcc 60 cggcacactg ctgggtggct tgcaaggcag ggagcgatgg aggggctctg gtctcagagg 120 gcccgggaac tgagcagaaa aattgggcag agactcaatc agtcacccga cttag 175 6 119 DNA Homo sapiens 6 agtttggtgt cactggacac gactgtgtgc atgacgggga gcctggggac agggcctgtg 60 actgcatggt cgtggacctt gttccttgcc tgggttccac cctggaggag ggagccctt 119 7 184 DNA Homo sapiens 7 atttggctgc ttactgctca aaagccaaat acgagagaca agagttggtg gcaggtttat 60 ctggagagcc agcaaaccag gaagatggta gataagtgtc taaagtgccg tctaagtcag 120 tacaggtttc gggttcttaa ttatgttaag ggcagtggga aaaagaggct 
gttgggatca 180 agag 184 8 188 DNA Homo sapiens 8 gtttctgggt ggagcggacc cctgtgcacg aggcagccca gcggggtgag agcctgcagc 60 tgcaacagct gatcgagagc ggcgcctgcg tgaaccaggt caccgtggac tccatcacgc 120 ccctgcacgc agccagtctg cagggccagg cgcggtgtgt gcagctgctg ctggcggctg 180 gggcccag 188 9 123 DNA Homo sapiens 9 cacataatag caaagcaaac cggtgcttct aagaagcatc ttagcaaaga tgaacctgca 60 gcaggatggg aaagtgggtt gcgtggcatt tctctgccct ttggacactt gtttcttcct 120 ccc 123 10 151 DNA Homo sapiens 10 gtggatgctc gcaacatcga cggcagcacc ccgctctgcg atgcctgcgc ctcgggcagc 60 atcgagtgtg tgaagctctt gctgtcctac ggggccaagg tcaaccctcc cctgtacaca 120 gcgtcccccc tgcacgaggc ctgcatgagc g 151 11 135 DNA Homo sapiens 11 ggagttccga atgtgtgagg cttcttattg acgtcggggc caatctggaa gcgcacgatt 60 gccattttgg gacccctctg cacgttgcct gtgcccggga gcatctggac tgtgtcaaag 120 tgctgctcaa tgcag 135 12 351 DNA Homo sapiens 12 cgggtgttta cctgcgtggg tgtttgccca agtgggtgtt tgcccgagtg ggtgtttacc 60 cgagtgggtg tttaccggag tggatgtttg cccgcgtggg tgtttaccgg agtgggtgtt 120 tgcccgcgtg ggtgtttacc ccagtgggtg tttgcctgag tgggtgtttg cccgtgtggg 180 tgtttacctg cgtgggtgtt tacccccggg ggtgtttgcc cgcgtgggtg tttacccgag 240 ggggtgtttg cccgagtggg tgtttgcccg cggcggtgtt taccagagtg gttgtttacc 300 ccagtgggtg tttacccgag ggggtgtttg cctgcgtggg tgtttgcctg c 351 13 192 DNA Homo sapiens 13 gggccaacgt gaatgcggca aagcttcatg agactgccct tcaccacgcg gccaaggtca 60 agaatgttga cctcatcgag atgcttatcg agtttggcgg caacatctac gcccgggaca 120 accgcgggaa gaagccgtct gactacacgt ggagcagcag cgctcccgcc aagtgcttcg 180 agtactacga aa 192 14 47 DNA Homo sapiens 14 gtggggtccg gaccgtggct gcccccgttg tgcccagcat tgcccgg 47 15 607 DNA Homo sapiens 15 atagaacaac gctccttcga gtcccttcct gcgatcctgt ttaggcttct ctcctggatc 60 ctggataatg tttccagggt gttgggaagg cctgcgtctc aggtcacagt tgtgggtgtg 120 gccctgcgct gttctacaga acctaccctc tcaatgggca tgggcccaac catccagttt 180 tcctctttta cggaccatcc tcaaaggcac tctcaggaca gacggcgtgg ggagcacaga 240 ggaggctggc agagctgggg actgagggca ttgttgctga ttctcactca ccggggcagc 300 ctgccgcaga 
tgcacaggcc ccaggtgcag gccaccacct ccgggtcggc accaggactg 360 ccctcggtgc tcatagggaa tggctgggcc cacggaaggt cggcctggga tgtggcctgg 420 gactgctgct ctgctggctg ctgtgtggat gcttttcctg gagcactttc caaggcatcc 480 cccagcccca agcctgcgcg catctgtcac tcagggactt tctatgggtc tttgtggggg 540 aaggccctgg ctttgtattc cccacaagta gcactgagtt tcttaggaaa tttgtcttca 600 gtattag 607 16 129 DNA Homo sapiens 16 acagtgtgga acagagtgag cctgagctgc ccaaggcgat gccctttgtg ctgtatgact 60 ggaggcctct ccgcgtgggt acttgggggt ggggccacaa atctggcgaa aaacttcatg 120 ctggccgcc 129 17 197 DNA Homo sapiens 17 aaacaccaag atgggcgcat cctgtgtgga tggagagggg tggctctcgt cttgctgtgt 60 cccaggtcgt ttggcagcgc tgtgcttgcc agggagccac actttcccat ctcttcccca 120 ggtctcctga tgccaacttc agcacccccc ttatgtggac ttttcttggg ggagacagac 180 gctgtccacg ggaccag 197 18 636 DNA Homo sapiens 18 gtgggaggcg tcgtgaggga gacagcaaga gaccccagtg ggatccccag tgggatctct 60 tgctcagctc tgggctcccc gtggtcttcc agggagggag tggaatctcg tctactcctg 120 gaccatagca atgctccctg tggtatccca tctgtgctac caggtgagtc tgcagagagc 180 tctgaagtca tttgggatga aaagaatgcc tttccctgtc catttcatgc cctgggccac 240 tttgttaaac tgcttgtgct ttctctccag tgggcaacac ctgtttcttc taggcagttg 300 agttccttta ttcggaaaaa cgagcgtaag actggctctg agaacaaagc tggacagctt 360 gctcatcttt cgagcagctg tgccgtggag agatgaaggt ggggcgcaca gggccctggc 420 aggggctgtg ccccctgtaa tggctgagaa aatatttcca gaccctggag tctttgcctt 480 ttctcttttc tccacattag caccacatga cagtgacagc gaggcctcag tgattttggc 540 ttgaaggtct tgtgatctcc gacaagttga atgaaaagat gtcttaaatt ggtccaatct 600 aaagagtgcc tcctttttct ccaaaccatg aaaaaa 636 19 2071 DNA Homo sapiens 19 gtttaaaaat cacagacctt aggctaggta atcagcaaat atttgctaaa tgagtaatag 60 aaactttgtc aatatcatag aggttcttac aatagattca aaaggtaatt ggaaccaaca 120 tgcaataaaa agagattaac acaaaaatag ataagtggat tactatttaa tgaatatttt 180 tggaaaaact ctatttcagt ttagaaaaga atgaactgag ctcctatatt ttcattctac 240 acattgcagt gaattccaga tgggactcaa agacttagca acaacaacag aacagaacag 300 aataatttat attttctggc tcttgctggg gaaatacatt ctaaatattg 
aagaaaaagg 360 aaatactttc agggaatgca tttctgaata tatatgtaaa ctttgcaaag ctttttctgt 420 aatgaaatct tttaatattt tggtcaagca acagaatgga ggaaaaacaa ataataagca 480 aatacttgct tattaatatg tttaggaaac acttggataa cacacataca catatgaata 540 tatacacaca cacatatata tatatgttct acatgatcat actaaaattt agatgacact 600 tcgtccagga agaaatattg aagaaaatgc aatcatacaa acacgacatt gccacagggg 660 aaaacataac atccatggta aatataacga tacaaaaatc tgttacattt agtaaaaagt 720 gcagaaaacc ggcaagaaaa atgcaggagc ctgattttgg tagaaagtgg caaaaaggct 780 ctaaatgccc agtgcaaatg gcttgaaatt ccattcccag gcctggggct ccacatttcc 840 aggggctctg ctttccaaaa gatatggttc aatgtcccct aggtatagtg ccttcggtct 900 ggactgcaga tggagacctg ggtcatccca gaagctttgt agtgtgagag atcaaactgg 960 ctgctctgcc aggcctcacc tatttccaga ctggatgatt cctgatactt ttacttttga 1020 gcccaaacta acttctgcat ggaggttcag tacaggtcag gtgagtgacg aatctcctga 1080 cagggctaca gtgaactttc acaaggttaa accttctaaa cttgggaagc tgctttgcaa 1140 aaataaataa ataaaataaa aatcccttac tcaattgttg actgggtatt attccggaaa 1200 caggaaacag gtttttcatt tctgccaggt tgtgccccag cttccatttg ctaagcagct 1260 taggggaaca ttggattaaa aaccacccgt aaagaaaagg cattaagaaa ttgatttgaa 1320 cgatttgaaa ttattatctt gaaagatcaa agaaacagtg attatgaaca ccggctctgg 1380 tgggaagcgg ttcttgaaag ttaaagcatg ggggaactcg agtactctgg agcatcatca 1440 gcatctgtac aatgcaagtg aactctggaa gctagacgca tgctttgcac gtggacacag 1500 acacatgctt gcacaggaca catttgcaat acatacatgt ttatacatgg acacatgttt 1560 tgcacacgct tgcacaggac acgcttgcac aggacacatg cacaggatgc atgttttgta 1620 caggacacat gttttgcaca tggaaacaca tatccatgct ttgcaaaggg acacagggaa 1680 gcatgcttac aatggataca agtttacaca cggacgcatg ttttgcacag gacatatgca 1740 caggatgcat gttttgcaca cggacacaca cgtttacata ggacaagtgc acaggacacg 1800 ttttgcaccg gacgtattct gtgcacggga cgcacggttt gcagatgcaa gcacacacac 1860 tcctcggcgc gacgggcgaa gcggggtggg cagcgacagc ggagtccgga cccagggaca 1920 gcgcctccgg gaacctcgcc aaggcggcgg gaactacaac tcccggctcg gtcgcctcgg 1980 gacgacgcgc ggcccagcgc agactggccc cgcccagcgg ccccgcgcgg cccggcccag 2040 
gtcccggcgc ccagagtcgc cgcgcggccg c 2071 20 87 DNA Homo sapiens 20 catctttgcg tgtggcagcg gaagacgcta aatccacctg gacgtcttgt gaatggaggc 60 acgtcgggaa cgcgccttcc cggctca 87 21 186 DNA Homo sapiens 21 ttctgggtgg agcggacccc tgtgcacgag gcagcccagc ggggtgagag cctgcagctg 60 caacagctga tcgagagcgg cgcctgcgtg aaccaggtca ccgtggactc catcacgccc 120 ctgcacgcag ccagtctgca gggccaggcg cggtgtgtgc agctgctgct ggcggctggg 180 gcccag 186 22 150 DNA Homo sapiens 22 gtggatgctc gcaacatcga cggcagcacc ccgctctgcg atgcctgcgc ctcgggcagc 60 atcgagtgtg tgaagctctt gctgtcctac ggggccaagg tcaaccctcc cctgtacaca 120 gcgtcccccc tgcacgaggc ctgcatgagc 150 23 132 DNA Homo sapiens 23 agttccgaat gtgtgaggct tcttattgac gtcggggcca atctggaagc gcacgattgc 60 cattttggga cccctctgca cgttgcctgt gcccgggagc atctggactg tgtcaaagtg 120 ctgctcaatg ca 132 24 189 DNA Homo sapiens 24 gccaacgtga atgcggcaaa gcttcatgag actgcccttc accacgcggc caaggtcaag 60 aatgttgacc tcatcgagat gcttatcgag tttggcggca acatctacgc ccgggacaac 120 cgcgggaaga agccgtctga ctacacgtgg agcagcagcg ctcccgccaa gtgcttcgag 180 tactacgaa 189 25 45 DNA Homo sapiens 25 ggggtccgga ccgtggctgc ccccgttgtg cccagcattg cccgg 45 26 294 DNA Homo sapiens 26 gtcacagttg tgggtgtggc cctgcgctgt tctacagaac ctaccctctc aatgggcatg 60 ggcccaacca tccagttttc ctcttttacg gaccatcctc aaaggcactc tcaggacaga 120 cggcgtgggg agcacagagg aggctggcag agctggggac tgagggcatt gttgctgatt 180 ctcactcacc ggggcagcct gccgcagatg cacaggcccc aggtgcaggc caccacctcc 240 gggtcggcac caggactgcc ctcggtgctc atagggaatg gctgggccca cgga 294 27 75 DNA Homo sapiens 27 tctcctgatg ccaacttcag cacccccctt atgtggactt ttcttggggg agacagacgc 60 tgtccacggg accag 75 28 366 DNA Homo sapiens 28 tgggcaacac ctgtttcttc taggcagttg agttccttta ttcggaaaaa cgagcgtaag 60 actggctctg agaacaaagc tggacagctt gctcatcttt cgagcagctg tgccgtggag 120 agatgaaggt ggggcgcaca gggccctggc aggggctgtg ccccctgtaa tggctgagaa 180 aatatttcca gaccctggag tctttgcctt ttctcttttc tccacattag caccacatga 240 cagtgacagc gaggcctcag tgattttggc ttgaaggtct tgtgatctcc gacaagttga 300 
atgaaaagat gtcttaaatt ggtccaatct aaagagtgcc tcctttttct ccaaaccatg 360 aaaaaa 366 29 2071 DNA Homo sapiens 29 gtttaaaaat cacagacctt aggctaggta atcagcaaat atttgctaaa tgagtaatag 60 aaactttgtc aatatcatag aggttcttac aatagattca aaaggtaatt ggaaccaaca 120 tgcaataaaa agagattaac acaaaaatag ataagtggat tactatttaa tgaatatttt 180 tggaaaaact ctatttcagt ttagaaaaga atgaactgag ctcctatatt ttcattctac 240 acattgcagt gaattccaga tgggactcaa agacttagca acaacaacag aacagaacag 300 aataatttat attttctggc tcttgctggg gaaatacatt ctaaatattg aagaaaaagg 360 aaatactttc agggaatgca tttctgaata tatatgtaaa ctttgcaaag ctttttctgt 420 aatgaaatct tttaatattt tggtcaagca acagaatgga ggaaaaacaa ataataagca 480 aatacttgct tattaatatg tttaggaaac acttggataa cacacataca catatgaata 540 tatacacaca cacatatata tatatgttct acatgatcat actaaaattt agatgacact 600 tcgtccagga agaaatattg aagaaaatgc aatcatacaa acacgacatt gccacagggg 660 aaaacataac atccatggta aatataacga tacaaaaatc tgttacattt agtaaaaagt 720 gcagaaaacc ggcaagaaaa atgcaggagc ctgattttgg tagaaagtgg caaaaaggct 780 ctaaatgccc agtgcaaatg gcttgaaatt ccattcccag gcctggggct ccacatttcc 840 aggggctctg ctttccaaaa gatatggttc aatgtcccct aggtatagtg ccttcggtct 900 ggactgcaga tggagacctg ggtcatccca gaagctttgt agtgtgagag atcaaactgg 960 ctgctctgcc aggcctcacc tatttccaga ctggatgatt cctgatactt ttacttttga 1020 gcccaaacta acttctgcat ggaggttcag tacaggtcag gtgagtgacg aatctcctga 1080 cagggctaca gtgaactttc acaaggttaa accttctaaa cttgggaagc tgctttgcaa 1140 aaataaataa ataaaataaa aatcccttac tcaattgttg actgggtatt attccggaaa 1200 caggaaacag gtttttcatt tctgccaggt tgtgccccag cttccatttg ctaagcagct 1260 taggggaaca ttggattaaa aaccacccgt aaagaaaagg cattaagaaa ttgatttgaa 1320 cgatttgaaa ttattatctt gaaagatcaa agaaacagtg attatgaaca ccggctctgg 1380 tgggaagcgg ttcttgaaag ttaaagcatg ggggaactcg agtactctgg agcatcatca 1440 gcatctgtac aatgcaagtg aactctggaa gctagacgca tgctttgcac gtggacacag 1500 acacatgctt gcacaggaca catttgcaat acatacatgt ttatacatgg acacatgttt 1560 tgcacacgct tgcacaggac acgcttgcac aggacacatg cacaggatgc 
atgttttgta 1620 caggacacat gttttgcaca tggaaacaca tatccatgct ttgcaaaggg acacagggaa 1680 gcatgcttac aatggataca agtttacaca cggacgcatg ttttgcacag gacatatgca 1740 caggatgcat gttttgcaca cggacacaca cgtttacata ggacaagtgc acaggacacg 1800 ttttgcaccg gacgtattct gtgcacggga cgcacggttt gcagatgcaa gcacacacac 1860 tcctcggcgc gacgggcgaa gcggggtggg cagcgacagc ggagtccgga cccagggaca 1920 gcgcctccgg gaacctcgcc aaggcggcgg gaactacaac tcccggctcg gtcgcctcgg 1980 gacgacgcgc ggcccagcgc agactggccc cgcccagcgg ccccgcgcgg cccggcccag 2040 gtcccggcgc ccagagtcgc cgcgcggccg c 2071 30 186 DNA Homo sapiens 30 ttctgggtgg agcggacccc tgtgcacgag gcagcccagc ggggtgagag cctgcagctg 60 caacagctga tcgagagcgg cgcctgcgtg aaccaggtca ccgtggactc catcacgccc 120 ctgcacgcag ccagtctgca gggccaggcg cggtgtgtgc agctgctgct ggcggctggg 180 gcccag 186 31 150 DNA Homo sapiens 31 gtggatgctc gcaacatcga cggcagcacc ccgctctgcg atgcctgcgc ctcgggcagc 60 atcgagtgtg tgaagctctt gctgtcctac ggggccaagg tcaaccctcc cctgtacaca 120 gcgtcccccc tgcacgaggc ctgcatgagc 150 32 132 DNA Homo sapiens 32 agttccgaat gtgtgaggct tcttattgac gtcggggcca atctggaagc gcacgattgc 60 cattttggga cccctctgca cgttgcctgt gcccgggagc atctggactg tgtcaaagtg 120 ctgctcaatg ca 132 33 189 DNA Homo sapiens 33 gccaacgtga atgcggcaaa gcttcatgag actgcccttc accacgcggc caaggtcaag 60 aatgttgacc tcatcgagat gcttatcgag tttggcggca acatctacgc ccgggacaac 120 cgcgggaaga agccgtctga ctacacgtgg agcagcagcg ctcccgccaa gtgcttcgag 180 tactacgaa 189 34 45 DNA Homo sapiens 34 ggggtccgga ccgtggctgc ccccgttgtg cccagcattg cccgg 45 35 294 DNA Homo sapiens 35 gtcacagttg tgggtgtggc cctgcgctgt tctacagaac ctaccctctc aatgggcatg 60 ggcccaacca tccagttttc ctcttttacg gaccatcctc aaaggcactc tcaggacaga 120 cggcgtgggg agcacagagg aggctggcag agctggggac tgagggcatt gttgctgatt 180 ctcactcacc ggggcagcct gccgcagatg cacaggcccc aggtgcaggc caccacctcc 240 gggtcggcac caggactgcc ctcggtgctc atagggaatg gctgggccca cgga 294 36 75 DNA Homo sapiens 36 tctcctgatg ccaacttcag cacccccctt atgtggactt ttcttggggg agacagacgc 60 tgtccacggg accag 
75 37 366 DNA Homo sapiens 37 tgggcaacac ctgtttcttc taggcagttg agttccttta ttcggaaaaa cgagcgtaag 60 actggctctg agaacaaagc tggacagctt gctcatcttt cgagcagctg tgccgtggag 120 agatgaaggt ggggcgcaca gggccctggc aggggctgtg ccccctgtaa tggctgagaa 180 aatatttcca gaccctggag tctttgcctt ttctcttttc tccacattag caccacatga 240 cagtgacagc gaggcctcag tgattttggc ttgaaggtct tgtgatctcc gacaagttga 300 atgaaaagat gtcttaaatt ggtccaatct aaagagtgcc tcctttttct ccaaaccatg 360 aaaaaa 366 38 22 DNA Artificial sequence partial sequence of AL365356 38 gaaacagtga ttatgaacac cg 22 39 17 DNA Artificial sequence partial sequence of AL365356 39 gcgaccgagc cgggagt 17 40 17 DNA Artificial sequence partial sequence of AL365356 40 ggagcggacc cctgtgc 17 41 16 DNA Artificial sequence partial sequence of AL365356 41 cagccgccag cagcag 16 42 17 DNA Artificial sequence partial sequence of AL365356 42 cgcaacatcg acggcag 17 43 18 DNA Artificial sequence partial sequence of AL365356 43 caggggggac gctgtgta 18 44 22 DNA Artificial sequence partial sequence of AL365356 44 tgtgtgaggc ttcttattga cg 22 45 22 DNA Artificial sequence partial sequence of AL365356 45 gcagcacttt gacacagtcc ag 22 46 19 DNA Artificial sequence partial sequence of AL365356 46 gagactgccc ttcaccacg 19 47 16 DNA Artificial sequence partial sequence of AL365356 47 agcacttggc gggagc 16 48 15 DNA Artificial sequence partial sequence of AL365356 48 ccggaccgtg gctgc 15 49 16 DNA Artificial sequence partial sequence of AL365356 49 gggcaatgct gggcac 16 50 22 DNA Artificial sequence partial sequence of AL365356 50 tacagaacct accctctcaa tg 22 51 17 DNA Artificial sequence partial sequence of AL365356 51 ctgcacctgg ggcctgt 17 52 19 DNA Artificial sequence partial sequence of AL365356 52 tgatgccaac ttcagcacc 19 53 17 DNA Artificial sequence partial sequence of AL365356 53 cccgtggaca gcgtctg 17 54 23 DNA Artificial sequence partial sequence of AL365356 54 gtttcttcta ggcagttgag ttc 23 55 22 DNA Artificial sequence partial sequence of AL365356 55 ccttcaagcc 
aaaatcactg ag 22 56 22 DNA Artificial sequence misc_feature (22) partial sequence of AL365356 , n is a, c, g or t 56 tttttttttt tttttttttt vn 22 57 413 DNA Artificial sequence partial sequence of AL365356 57 gaggcagccc agcggggtga gagcctgcag ctgcaacagc tgatcgagag cggcgcctgc 60 gtgaaccagg tcaccgtgga ctccatcacg cccctgcacg cagccagtct gcagggccag 120 gcgcggtgtg tgcagctgct gctggcggct ggggcccagg tggatgctcg caacatcgac 180 ggcagcaccc cgctctgcga tgcctgcgcc tcgggcagca tcgagtgtgt gaagctcttg 240 ctgtcctacg gggccaaggt caaccctccc ctgtacacag cgtcccccct gcacgaggcc 300 tgcatgagcg ggagttccga atgtgtgagg cttcttattg acgtcggggc caatctggaa 360 gcgcacgatt gccattttgg gacccctctg cacgttgcct gtgcccggga gca 413 58 549 DNA Artificial sequence partial sequence of AL365356 58 cttcttcccg cggttgtccc gggcgtagat gttgccgcca aactcgataa gcatctcgat 60 gaggtcaaca ttcttgacct tggccgcgtg gtgaagggca gtctcatgaa gctttgccgc 120 attcacgttg gcccctgcat tgagcagcac tttgacacag tccagatgct cccgggcaca 180 ggcaacgtgc agaggggtcc caaaatggca atcgtgcgct tccagattgg ccccgacgtc 240 aataagaagc ctcacacatt cggaactccc gctcatgcag gcctcgtgca ggggggacgc 300 tgtgtacagg ggagggttga ccttggcccc gtaggacagc aagagcttca cacactcgat 360 gctgcccgag gcgcaggcat cgcagagcgg ggtgctgccg tcgatgttgc gagcatccac 420 ctgggcccca gccgccagca gcagctgcac acaccgcgcc tggccctgca gactggctgc 480 gtgcaggggc gtgatggagt ccacggtgac ctggttcacg caggcgccgc tctcgatcag 540 ctgttgcag 549 59 616 DNA Artificial sequence partial sequence of AL365356 59 tgcaacagct gatcgagagc ggcgcctgcg tgaaccaggt caccgtggac tccatcacgc 60 ccctgcacgc agccagtctg cagggccagg cgcggtgtgt gcagctgctg ctggcggctg 120 gggcccaggt ggatgctcgc aacatcgacg gcagcacccc gctctgcgat gcctgcgcct 180 cgggcagcat cgagtgtgtg aagctcttgc tgtcctacgg ggccaaggtc aaccctcccc 240 tgtacacagc gtcccccctg cacgaggcct gcatgagcgg gagttccgaa tgtgtgaggc 300 ttcttattga cgtcggggcc aatctggaag cgcacgattg ccattttggg acccctctgc 360 acgttgcctg tgcccgggag catctggact gtgtcaaagt gctgctcaat gcaggggcca 420 acgtgaatgc ggcaaagctt catgagactg cccttcacca 
cgcggccaag gtcaagaatg 480 ttgacctcat cgagatgctt atcgagtttg gcggcaacat ctacgcccgg gacaaccgcg 540 ggaagaagcc gtctgactac acgtggagca gcagcgctcc cgccaagtgc ttcgagtact 600 acgaaaagac acctct 616 60 1118 DNA Artificial sequence partial sequence of AL365356 60 caacagctga tcgagagcgg cgcctgcgtg aaccaggtca ccgtggactc catcacgccc 60 ctgcacgcag ccagtctgca gggccaggcg cggtgtgtgc agctgctgct ggcggctggg 120 gcccaggtgg atgctcgcaa catcgacggc agcaccccgc tctgcgatgc ctgcgcctcg 180 ggcagcatcg agtgtgtgaa gctcttgctg tcctacgggg ccaaggtcaa ccctcccctg 240 tacacagcgt cccccctgca cgaggcctgc atgagcggga gttccgaatg tgtgaggctt 300 cttattgacg tcggggccaa tctggaagcg cacgattgcc attttgggac ccctctgcac 360 gttgcctgtg cccgggagca tctggactgt gtcaaagtgc tgctcaatgc aggggccaac 420 gtgaatgcgg caaagcttca tgagactgcc cttcaccacg cggccaaggt caagaatgtt 480 gacctcatcg agatgcttat cgagtttggc ggcaacatct acgcccggga caaccgcggg 540 aagaagccgt ctgactacac gtggagcagc agcgctcccg ccaagtgctt cgagtactac 600 gaaaagacac ctctgactct gtcacagctc tgcagggtga acttgaggaa ggccactggc 660 gtccgagggc tggagaagat tgccaagtta aacatcccgc cccggctcat tgattacctc 720 tcctacaact gaattgcagg tggggtccgg accgtgactg cccccgttgt gcccagcatt 780 gcccgggtga gggctctgcc tgttcctctg aagcagcgtg attgctgtag atagaacaac 840 gctccttcga gtcccttcct gcgatcctgt ttaggcttct ctcctggatc ctggataatg 900 tttccagggt gttgggaagg cctgcgtctc aggtcacagt tgtgggtgtg gccctgcgct 960 gttctacaga acctaccctc tcaatgggca tgggcccaac catccagttt tcctctttta 1020 cggaccatcc tcaaaggcac tctcaggaca gacggcgtgg ggagcacaga ggaggctggc 1080 agagctgggg actcagggca ttgttgctga ttctcact 1118 61 550 DNA Artificial sequence partial sequence of AL365356 61 catcgagtgt gtgaagctct tgctgtccta cggggccaag gtcaaccctc ccctgtacac 60 agcgtccccc ctgcacgagg cctgcatgag cgggagttcc gaatgtgtga ggcttcttat 120 tgacgtcggg gccaatctgg aagcgcacga ttgccatttt gggacccctc tgcacgttgc 180 ctgtgcccgg gagcatctgg actgtgtcaa agtgctgctc aatgcagggg ccaacgtgaa 240 tgcggcaaag cttcatgaga ctgcccttca ccacgcggcc aaggtcaaga atgttgacct 300 catcgagatg cttatcgagt 
ttggcggcaa catctacgcc cgggacaacc gcgggaagaa 360 gccgtctgac tacacgtgga gcagcagcgc tcccgccaag tgcttcgagt actacgaaaa 420 gacacctctg actctgtcac agctctgcag ggtgaacttg aggaaggcca ctggcgtccg 480 agggctggag aagattgcca agttaaacat cccgccccgg ctcattgatt acctctccta 540 caactgaatt 550 62 926 DNA Artificial sequence misc_feature (909) partial sequence of AL365356 , n is a, c, g or t 62 catcgagtgt gtgaagctct tgctgtccta cggggccaag gtcaaccctc ccctgtacac 60 agcgtccccc ctgcacgagg cctgcatgag cgggagttcc gaatgtgtga ggcttcttat 120 tgacgtcggg gccaatctgg aagcgcacga ttgccatttt gggacccctc tgcacgttgc 180 ctgtgcccgg gagcatctgg actgtgtcaa agtgctgctc aatgcagggg ccaacgtgaa 240 tgcggcaaag cttcatgaga ctgcccttca ccacgcggcc aaggtcaaga atgttgacct 300 catcgagatg cttatcgagt ttggcggcaa catctacgcc cgggacaacc gcgggaagaa 360 gccgtctgac tacacgtgga gcagcagcgc tcccgccaag tgcttcgagt actacgaaaa 420 gacacctctg actctgtcac agctctgcag ggtgaacttg aggaaggcca ctggcgtccg 480 agggctggag aagattgcca agttaaacat cccgccccgg ctcattgatt acctctccta 540 caactgaatt gcaggtgggg tccggaccgt gactgccccc gttgtgccca gcattgcccg 600 ggtgagggct ctgcctgttc ctctgaagca gcgtgattgc tgtagataga acaacgctcc 660 ttcgagtccc ttcctgcgat cctgtttagg cttctctcct ggatcctgga taatgtttcc 720 agggtgttgg gaaggcctgc gtctcaggtc acagttgtgg gtgtggccct gcgctgttct 780 acagaaccta ccctctcaat gggcatgggc ccaaccatcc agttttcctc ttttacggac 840 catcctcaaa ggcactctca ggacagacgg cgtggggagc acagaggagg ctggcagagc 900 tggggactna gggcattgtt gctgat 926 63 796 DNA Artificial sequence misc_feature (188) partial sequence of AL365356 , n is a, c, g or t 63 tgcggcaggc tgccccggtg agtgagaatc agcaacaatg ccctgagtcc ccagctctgc 60 cagcctcctc tgtgctcccc acgccgtctg tcctgagagt gcctttgagg atggtccgta 120 aaagaggaaa actggatggt tgggcccatg cccattgaga gggtaggttc tgtagaacag 180 cgcagggnca cacccacaac tgtgacctga gacgcaggcc ttcccaacac cctggaaaca 240 ttatccagga tccaggagag aagcctaaac aggatcgcag gaagggactc gaaggagcgt 300 tgttctatct acagcaatca cgctgcttca gaggaacagg cagagccctc acccgggcaa 360 tgctgggcac aacgggggca 
gtcacggtcc ggaccccacc tgcaattcag ttgtaggaga 420 ggtaatcaat gagccggggc gggatgttta acttggcaat cttctccagc cctcggacgc 480 cagtggcctt cctcaagttc accctgcaga gctgtgacag agtcagaggt gtcttttcgt 540 agtactcgaa gcacttggcg ggagcgctgc tgctccacgt gtagtcagac ggcttcttcc 600 cgcggttgtc ccgggcgtag atgttgccgc caaactcgat aagcatctcg atgaggtcaa 660 cattcttgac cttggccgcg tggtgaaggg cagtctcatg aagctttgcc gcattcacgt 720 tggcccctgc attgagcagc actttgacac agtccagatg ctcccgggca caggcaacgt 780 gcagaggggt cccaaa 796 64 1251 DNA Artificial sequence partial sequence of AL365356 64 gaggcagccc agcggggtga gagcctgcag ctgcaacagc tgatcgagag cggcgcctgc 60 gtgaaccagg tcaccgtgga ctccatcacg cccctgcacg cagccagtct gcagggccag 120 gcgcggtgtg tgcagctgct gctggcggct ggggcccagg tggatgctcg caacatcgac 180 ggcagcaccc cgctctgcga tgcctgcgcc tcgggcagca tcgagtgtgt gaagctcttg 240 ctgtcctacg gggccaaggt caaccctccc ctgtacacag cgtcccccct gcacgaggcc 300 tgcatgagcg ggagttccga atgtgtgagg cttcttattg acgtcggggc caatctggaa 360 gcgcacgatt gccattttgg gacccctctg cacgttgcct gtgcccggga gcatctggac 420 tgtgtcaaag tgctgctcaa tgcaggggcc aacgtgaatg cggcaaagct tcatgagact 480 gcccttcacc acgcggccaa ggtcaagaat gttgacctca tcgagatgct tatcgagttt 540 ggcggcaaca tctacgcccg ggacaaccgc gggaagaagc cgtctgacta cacgtggagc 600 agcagcgctc ccgccaagtg cttcgagtac tacgaaaaga cacctctgac tctgtcacag 660 ctctgcaggg tgaacttgag gaaggccact ggcgtccgag ggctggagaa gattgccaag 720 ttaaacatcc cgccccggct cattgattac ctctcctaca actgaattgc aggtggggtc 780 cggaccgtga ctgcccccgt tgtgcccagc attgcccggg tgagggctct gcctgttcct 840 ctgaagcagc gtgattgctg tagatagaac aacgctcctt cgagtccctt cctgcgatcc 900 tgtttaggct tctctcctgg atcctggata atgtttccag ggtgttggga aggcctgcgt 960 ctcaggtcac agttgtgggt gtggccctgc gctgttctac agaacctacc ctctcaatgg 1020 gcatgggccc aaccatccag ttttcctctt ttacggacca tcctcaaagg cactctcagg 1080 acagacggcg tggggagcac agaggaggct ggcagagctg gggactcagg gcattgttgc 1140 tgattctcac tcaccggggc agcctgccgc agatgcacag gccccaggtg caggccacca 1200 cctccgggtc ggcaccagga ctgccctcgg 
tgctcatagg gaatggctgg g 1251 65 462 DNA Artificial sequence partial sequence of AL365356 65 atgagcggga gttccgaatg tgtgaggctt cttattgacg tcggggccaa tctggaagcg 60 cacgattgcc attttgggac ccctctgcac gttgcctgtg cccgggagca tctggactgt 120 gtcaaagtgc tgctcaatgc aggggccaac gtgaatgcgg caaagcttca tgagactgcc 180 cttcaccacg cggccaaggt caagaatgtt gacctcatcg agatgcttat cgagtttggc 240 ggcaacatct acgcccggga caaccgcggg aagaagccgt ctgactacac gtggagcagc 300 agcgctcccg ccaagtgctt cgagtactac gaaaagacac ctctgactct gtcacagctc 360 tgcagggtga acttgaggaa ggccactggc gtccgagggc tggagaagat tgccaagtta 420 aacatcccgc cccggctcat tgattacctc tcctacaact ga 462 

What is claimed is:
 1. A method for presentation of sequences, comprising the steps of: extracting a first partial sequence from an mRNA; extracting a second partial sequence from a database by searching, the second partial sequence corresponding to the first partial sequence; predicting a first exon region within the second partial sequence using a first program; predicting a second exon region within the second partial sequence using a second program; and extracting a common region between the first exon region and the second exon region as a common sequence.
 2. A method according to claim 1 wherein: a plurality of the first exon regions and a plurality of the second exon regions are predicted, and common regions among the plurality of first exon regions and the plurality of second exon regions are extracted.
 3. A method according to claim 1 further comprising the steps of: predicting a third exon region within the second partial sequence using a third program; and extracting a common region among the first, second and third exon regions as a common sequence.
 4. A display system comprising: means for displaying a first partial sequence derived from an mRNA; means for displaying a second partial sequence derived from a genome sequence in a database, the second partial sequence corresponding to the first partial sequence; a selection button for selecting a plurality of different programs including first and second programs; means for displaying exon regions of the second partial sequence, the exon regions being extracted through the use of the selected plurality of different programs; means for displaying a first exon region extracted through the use of the first program; means for displaying a second exon region extracted through the use of the second program; and a common sequence extraction button for extracting a sequence common to the first exon region and the second exon region.
 5. A display system according to claim 4 wherein: the common sequence extraction button is a button for extracting common sequence(s), and the system further comprises selecting means for extracting a 5′ end sequence of any one of common sequence(s) and a 3′ end sequence of any one of common sequence(s) as a set of primers.
 6. A display system according to claim 5 wherein the selecting means comprises means for selecting the length of a sequence to be amplified.
 7. A display system according to claim 5 further comprising a sequence displaying means for displaying a sequence of the set of primers.
 8. A display system according to claim 5 wherein the selecting means comprises means for extracting a plurality of primer sets.
 9. A display system according to claim 5 further comprising sequence displaying means for displaying a sequence of a region sandwiched between two members of the selected set of primers.
 10. A display system according to claim 4 further comprising a minority sequence extraction button for extracting an exon region that is predicted through the use of one of the first and second programs but is not predicted through the use of the other.
 11. A method comprising the steps of: extracting a first partial sequence from an mRNA; identifying a second partial sequence corresponding to the first partial sequence from within a genome sequence; identifying common sequence(s) among exon regions within the second partial sequence, the exon regions being predicted through the use of a plurality of exon prediction programs; selecting a combination of a 5′ end sequence and a 3′ end sequence from the common sequence(s); and designing a set of primers based on the selected combination of the 5′ end and 3′ end sequences.
 12. A method according to claim 11 wherein the combination is selected based on the length of a sequence to be amplified.
 13. A method for cloning according to claim 11, further comprising the steps of: performing amplification using said set of primers; and cloning the resulting amplified gene.
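
The common sequence extraction of claims 1 to 3, the minority sequence extraction of claim 10, and the primer selection of claims 11 and 12 can be illustrated with a minimal sketch. It is not the claimed implementation. It assumes that each exon prediction program reports its exons as non-overlapping half-open coordinate pairs (start, end) on the second partial sequence, that the forward primer is read directly from the 5′ end of one common sequence, and that the reverse primer is the reverse complement of the 3′ end of another. The function names, the coordinate convention and the default primer length are hypothetical and do not come from the specification.

```python
from typing import List, Tuple

# Assumption: half-open [start, end) coordinates on the genomic partial sequence.
Interval = Tuple[int, int]

def common_regions(a: List[Interval], b: List[Interval]) -> List[Interval]:
    """Regions predicted as exon by both programs (sweep over sorted intervals).

    Assumes the intervals within each list do not overlap one another.
    """
    a, b = sorted(a), sorted(b)
    i = j = 0
    out: List[Interval] = []
    while i < len(a) and j < len(b):
        start = max(a[i][0], b[j][0])
        end = min(a[i][1], b[j][1])
        if start < end:               # non-empty overlap
            out.append((start, end))
        # advance whichever interval finishes first
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return out

def minority_exons(a: List[Interval], b: List[Interval]) -> List[Interval]:
    """Exons predicted by one program that overlap no exon of the other."""
    def overlaps(x: Interval, others: List[Interval]) -> bool:
        return any(x[0] < o[1] and o[0] < x[1] for o in others)
    return ([x for x in a if not overlaps(x, b)]
            + [y for y in b if not overlaps(y, a)])

_COMPLEMENT = str.maketrans("acgtACGT", "tgcaTGCA")

def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA string (a, c, g, t only)."""
    return seq.translate(_COMPLEMENT)[::-1]

def primer_pair(genome: str, five: Interval, three: Interval,
                primer_len: int = 18) -> Tuple[str, str, int]:
    """Forward primer, reverse primer, and the genomic span between them.

    `five` and `three` are two common regions; primer_len is a hypothetical
    default. The span is measured on the genomic sequence, so it includes
    any introns lying between the two regions.
    """
    fwd = genome[five[0]:five[0] + primer_len]
    rev = reverse_complement(genome[three[1] - primer_len:three[1]])
    return fwd, rev, three[1] - five[0]
```

For a third program, as in claim 3, the result of common_regions for the first two programs can be passed back in together with the third prediction set. Selecting by the length of the sequence to be amplified, as in claim 12, then amounts to filtering candidate region pairs on the third value returned by primer_pair.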